Preface


Methods employed when testing hypotheses concerning populations must consider the possibility of correlation existing among the variables that characterize the entity

I would also like to thank my reader, Lt Col Richard Kulp, whose advice and editing aided me greatly; Mr. Jerry Petrak and the Engineering and Design Data Group, AFWAL/MLSE, for supplying the experimental data and assisting in its analysis; and my typist, Phyllis Reynolds, for her superb job.

Finally,  I  want  to  express  a  special  thanks  to 
my  wife,  Lois,  for  her  dedication  and  support;  and  our 
children,  Robert,  Daniel,  and  Cristina,  who  were  a  con¬ 
Abstract


Let a random sample of size $N_g$ be drawn from a p-variate normal population $N_p(\mu^g, \Sigma_g)$, $g = 1, 2, \ldots, k$. In this thesis we consider the problem of testing the following hypotheses:

[i] $H_0$: $\mu^1 = \mu^2 = \cdots = \mu^k$, $\Sigma_1 = \Sigma_2 = \cdots = \Sigma_k$.

[ii] $H_1$: $\Sigma_1 = \Sigma_2 = \cdots = \Sigma_k$. The means can be any value.

[iii] $H_2$: $\mu^1 = \mu^2 = \cdots = \mu^k$ given $\Sigma_1 = \Sigma_2 = \cdots = \Sigma_k$,

against the general alternatives.

Likelihood ratio criteria and their sampling distributions are derived for p = 2 and equal sample sizes. From these distributions, tables of percentage points for the three likelihood ratio criteria are computed.

A useful approximation is also obtained. The theoretical results are then applied to actual data.

ON THE DISTRIBUTION OF THE LIKELIHOOD RATIO
CRITERIA ASSOCIATED WITH K SAMPLES OF
TWO CORRELATED RANDOM VARIABLES

I.  Introduction 

Statistical techniques enable experimenters to analyze the variation and covariation that exist among the measured characteristics of observed events. Analysts seek to assign causes to this variation, test and compare alternative hypotheses, and express the results in terms of a measure of probability. Some of the questions these hypotheses address are:

(1) Is the sample from a specified population? (2) Are the k samples from a common but unspecified population? (3) Is the population completely specified or only partly? (4) Do several populations with different means have the same standard deviations? (5) Are the variables being tested correlated?

One approach, the Analysis of Variance, developed by R. A. Fisher, is based on the assumption that the unexplained variation (the residuals) is normally and independently distributed and that the populations have the same standard deviation. The assumption that the standard deviations are the same is not always true and, therefore, the results obtained could be misleading to the user of the information. Multivariate analysis theory is well suited to this case, specifically where two or more correlated variables are involved.


Background 

In early developments of hypothesis testing, the fundamental hypothesis H, that two samples come from the same unknown normal population, was treated by Professor V. Romanovsky in his paper entitled "On Criteria that Two Given Samples Belong to the Same Normal Population" (Ref 14). Romanovsky approached the problem assuming the hypothesis H to be true and derived the distribution function for his test criteria. He provided four alternative criteria for testing his hypothesis H. These criteria are as follows:


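[Equations (1.1) and (1.2), which define the first two criteria, are illegible in the source scan.]

\[
t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{n_1 s_1^2 + n_2 s_2^2}} \sqrt{\frac{n_1 n_2 (n_1 + n_2 - 2)}{n_1 + n_2}} \tag{1.3}
\]

\[
\theta = \frac{s_2^2}{s_1^2} \tag{1.4}
\]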


Neyman and Pearson (Ref 11:201) point out that the first three criteria, including t, are not sensitive to differences in population standard deviations. For example, one pair of samples may have $s_1$ and $s_2$ almost equal, whereas in a second pair the two standard deviations could differ greatly, yet the value of t may be the same in both cases. The criterion $\theta$ does distinguish between the population standard deviations, but is not sensitive to the difference between their means.

Because of these restrictions on the criteria, further research was necessary to derive a test statistic. The test statistic should have the following properties: (1) it should be able to distinguish between population standard deviations and between their means, and (2) it should be selected in such a manner that it will minimize the danger of accepting a false hypothesis.

Neyman and Pearson use the likelihood ratio test to derive a test statistic for one variable and two populations that satisfies the above requirements.

The  Likelihood  Ratio  Criterion 
of  Neyman  and  Pearson 

R. A. Fisher in the early 1920s proposed a general method of estimation called the method of maximum likelihood, from which the likelihood ratio criterion for testing hypotheses was developed. The method produces sufficient estimates for the parameters whenever they exist, and the estimates are asymptotically ($n \to \infty$) minimum variance unbiased estimators.

Now we will discuss the likelihood ratio criterion of Neyman and Pearson.

Let the stochastically independent random variables $X_1$ and $X_2$ be chosen from two normal populations where the means and variances are any values. Then our parameter space is $\Omega = (\mu^1, \mu^2, \Sigma_1, \Sigma_2)$, where $-\infty < \mu^1 < \infty$, $-\infty < \mu^2 < \infty$, $0 < \Sigma_1 < \infty$, and $0 < \Sigma_2 < \infty$. We wish to test the hypothesis $H_0$: $\mu^1 = \mu^2$, $\Sigma_1 = \Sigma_2$ against all alternatives. Under $H_0$ let $\omega$ be such that $-\infty < \mu^1 = \mu^2 < \infty$ and $0 < \Sigma_1 = \Sigma_2 < \infty$. Let $L(\Omega)$ and $L(\omega)$ define the likelihood function for $\Omega$ and $\omega$ respectively and $L(\hat{\Omega})$ and $L(\hat{\omega})$ be their maxima. Then Neyman and Pearson obtained the likelihood ratio criterion for testing $H_0$ in the form (Ref 11:103):

\[
\lambda_{H_0} = \frac{L(\hat{\omega})}{L(\hat{\Omega})} = \left(\frac{s_1}{s_0}\right)^{n_1} \left(\frac{s_2}{s_0}\right)^{n_2}
\]

where $n_1$ is the sample size for $X_1$,
$n_2$ is the sample size for $X_2$,
$s_1$ is the standard deviation for $X_1$,
$s_2$ is the standard deviation for $X_2$, and
$s_0$ is the standard deviation obtained by combining the $n_1$ and $n_2$ observations of the samples $X_1$ and $X_2$.

Our criterion $\lambda_{H_0}$ lies between 0 and 1. If our hypothesis $H_0$ is true we would expect the ratio of $L(\hat{\omega})$ to $L(\hat{\Omega})$ to approach unity. The closer to unity, the more confidence we have that $H_0$ is true. However, if $\lambda_{H_0}$ approaches zero we become more certain that the hypothesis $H_0$ is false.

The nature of the hypothesis $H_0$ allows us to separate it into two hypotheses: (1) $H_1$: The samples come from unknown populations with the same variance, but with means having any value whatever; and (2) $H_2$: The means are the same, assuming equal variances and normal populations.

If we use the criteria of equations (1.2) and (1.3) and the equation

\[
s_0^2 = \frac{n_1 s_1^2 + n_2 s_2^2}{n_1 + n_2} + \frac{n_1 n_2}{(n_1 + n_2)^2} (\bar{x}_1 - \bar{x}_2)^2
\]

then $\lambda_H$ can be represented as a function of Romanovsky's criteria t and $\theta$ from equations (1.3) and (1.4):

\[
\lambda_H = (n_1 + n_2)^{\frac{n_1 + n_2}{2}} \, \theta^{\frac{n_2}{2}} \, (n_1 + n_2 \theta)^{-\frac{n_1 + n_2}{2}} \left(1 + \frac{t^2}{n_1 + n_2 - 2}\right)^{-\frac{n_1 + n_2}{2}} \tag{1.5}
\]
From equation (1.5) Neyman and Pearson (Ref 11:104) derived the likelihood ratio criteria for testing $H_1$ and $H_2$. Thus, the likelihood of $H_1$ is

\[
\lambda_{H_1} = (n_1 + n_2)^{\frac{n_1 + n_2}{2}} \, \theta^{\frac{n_2}{2}} \, (n_1 + n_2 \theta)^{-\frac{n_1 + n_2}{2}} \tag{1.6}
\]

The likelihood of $H_2$ is

\[
\lambda_{H_2} = \left(1 + \frac{t^2}{n_1 + n_2 - 2}\right)^{-\frac{n_1 + n_2}{2}} \tag{1.7}
\]

Combining $\lambda_{H_1}$ and $\lambda_{H_2}$ the result is

\[
\lambda_H = \lambda_{H_1} \lambda_{H_2} \tag{1.8}
\]


From equation (1.5) it can be seen that $\lambda_H$ attains its maximum value of unity when both $\theta = 1$ and $t = 0$, that is, when $s_1 = s_2$ and $\bar{x}_1 = \bar{x}_2$. $\lambda_H$ decreases towards zero when

a) $\theta \to 0$, or $s_2$ becomes small compared with $s_1$.

b) $\theta \to \infty$, or $s_1$ becomes small compared with $s_2$.

c) $|t| \to \infty$, or $|\bar{x}_1 - \bar{x}_2|$ increases compared with the estimated standard error of the difference of means.

Thus, even if $\bar{x}_1 = \bar{x}_2$, or $\lambda_{H_2} = 1$, we cannot accept $H_0$ if $s_1$ differs considerably from $s_2$. If $s_1 = s_2$ ($\lambda_{H_1} = 1$), then the populations are not the same if $(\bar{x}_1 - \bar{x}_2)$ is large compared to the estimate, based on the sample variances, of the standard error of the difference of means.

Thus, the criterion $\lambda_H = \lambda_{H_1} \lambda_{H_2}$ is more crucial than either $\lambda_{H_1}$ or $\lambda_{H_2}$ taken separately. Therefore, our conclusion is that $\lambda_H$ is a reasonable criterion to use for measuring the danger of accepting a false hypothesis.

To control the error of rejecting a true hypothesis, it is necessary to determine the sampling distribution of $\lambda_H$. The distributions of $\lambda_{H_1}$ and $\lambda_{H_2}$ and an approximation for that of $\lambda_H$ are derived in Neyman and Pearson's paper "On the Problem of Two Samples" (Ref 11:106-109).
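The relation between the standard-deviation form of the criterion and its factorization into $\lambda_{H_1}$ and $\lambda_{H_2}$ can be checked numerically. The following is a minimal Python sketch, not part of the original thesis. It assumes variances computed with divisor n, takes $\theta$ as the ratio $s_2^2/s_1^2$, and uses made-up sample values; the function names are hypothetical.

```python
import math

def sample_stats(xs):
    """Return (n, mean, variance with divisor n) for one sample."""
    n = len(xs)
    mean = sum(xs) / n
    var = sum((x - mean) ** 2 for x in xs) / n
    return n, mean, var

def two_sample_criteria(x1, x2):
    """Return (lam_H, lam_H1, lam_H2) computed from the sample variances."""
    n1, m1, v1 = sample_stats(x1)
    n2, m2, v2 = sample_stats(x2)
    n = n1 + n2
    va = (n1 * v1 + n2 * v2) / n                   # pooled within-sample variance
    v0 = va + n1 * n2 * (m1 - m2) ** 2 / n ** 2    # variance of the combined sample
    log_h1 = 0.5 * (n1 * math.log(v1) + n2 * math.log(v2) - n * math.log(va))
    log_h2 = 0.5 * n * (math.log(va) - math.log(v0))
    log_h = 0.5 * (n1 * math.log(v1) + n2 * math.log(v2) - n * math.log(v0))
    return math.exp(log_h), math.exp(log_h1), math.exp(log_h2)

def criteria_from_t_theta(x1, x2):
    """Return (lam_H1, lam_H2) written in terms of t and theta."""
    n1, m1, v1 = sample_stats(x1)
    n2, m2, v2 = sample_stats(x2)
    n = n1 + n2
    t = (m1 - m2) / math.sqrt(n1 * v1 + n2 * v2) * math.sqrt(n1 * n2 * (n - 2) / n)
    theta = v2 / v1                                 # assumed definition of theta
    lam_h1 = n ** (n / 2) * theta ** (n2 / 2) * (n1 + n2 * theta) ** (-n / 2)
    lam_h2 = (1 + t ** 2 / (n - 2)) ** (-n / 2)
    return lam_h1, lam_h2

# Made-up data, for illustration only.
x1 = [1.0, 2.0, 4.0, 3.5]
x2 = [2.5, 3.0, 5.5, 4.0, 6.0]
lam_h, lam_h1, lam_h2 = two_sample_criteria(x1, x2)
print(lam_h, lam_h1 * lam_h2)          # the two values should agree
print(criteria_from_t_theta(x1, x2))   # should match (lam_h1, lam_h2)
```

Working with logarithms avoids overflow when the sample sizes are large; the two routes to $\lambda_{H_1}$ and $\lambda_{H_2}$ agree up to rounding.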

The extension of Romanovsky's work to k samples of a single variable was undertaken in Neyman and Pearson's article "On the Problem of k Samples" (Ref 12). The likelihood ratio criteria $\lambda_H$, $\lambda_{H_1}$, and $\lambda_{H_2}$ were derived by generalizing the two-sample derivation. However, methods for calculating the distributions of $\lambda_H$ and $\lambda_{H_1}$ were not available at the time of the article. In this case, approximate solutions of the problem were reached by use of the moment coefficients of the $\lambda_H$ expressions.
Objective

A further generalization, the problem of k samples of two variables, is treated by Pearson and Wilks (Ref 13). Specifically, the problem treated is the case of two correlated variables x and y which have a bivariate normal distribution. The three hypotheses considered are:

1. The hypothesis $H_0$ that the k populations are identical.

2. The hypothesis $H_1$ that the samples have come from populations with the same set of variances and correlations, but having means with any differing values whatever.

3. The hypothesis $H_2$ that the samples are from populations in which the means are equal, when it is assumed that the variances and covariances are equal.

Testing these hypotheses is of great interest to industry; however, the distributions of the test statistics for these three hypotheses are not known, and so finding percentage points of these criteria has been difficult. The aim of this thesis is to:

1. Derive the sampling distribution of the test statistic for each of the three hypotheses.

2. Prepare tables of percentage points for $\alpha$ = .01, .05 and for N = 3 to 100, k = 2(1)6.

3. Derive an asymptotic expansion of the distributions which is valid for moderately large values of N.

4. Illustrate the results obtained in this thesis by applying them to actual data.

Chapters II, III, and IV provide preliminary information necessary for the derivation of the sampling distributions of the criteria for the three hypotheses. In Chapter V the actual derivation of the sampling distribution is undertaken for each hypothesis, from which tables of significance levels are obtained. Chapter VI gives an approximation method valid for moderate values of N. Chapter VII uses actual data submitted by the Engineering Division of the Air Force Materials Laboratory, Wright-Patterson AFB, Ohio, to demonstrate the practical application of the theoretical results.


II.  Statistical  Preliminaries 


Multivariate  Normal 
Distribution 


Let the vector x have p components, i.e., $x = (x_1, x_2, \ldots, x_p)$; then x has a p-variate non-singular normal distribution if its p.d.f. is

\[
f(x) = \frac{1}{(2\pi)^{p/2} |\Sigma|^{1/2}} \exp\left\{-\tfrac{1}{2} (x - \mu)' \Sigma^{-1} (x - \mu)\right\} \tag{2.1}
\]

$\mu$ and $\Sigma$ are the parameters of the distribution; $\mu$ is a column vector of elements $\mu_i$ (i = 1, 2, ..., p) and $\Sigma = (\sigma_{ij})$ is a positive definite symmetric matrix of order p. The p.d.f. (2.1) will be denoted by $N_p(x \mid \mu, \Sigma)$ and the notation $x \sim N_p(x \mid \mu, \Sigma)$ will be used to indicate that the variates x have a p-variate non-singular normal distribution with parameters $\mu$ and $\Sigma$. When $\Sigma$ is a diagonal matrix, f(x) is the product of the p.d.f.'s of p univariate normal variates, showing that the x's are independently distributed in that case.

When p = 2, f(x) has the following bivariate normal p.d.f.

\[
f(x) = \frac{1}{2\pi |\Sigma|^{1/2}} \exp\left\{-\tfrac{1}{2} (x - \mu)' \Sigma^{-1} (x - \mu)\right\}.
\]
Let x be a random sample of size N from a distribution with p.d.f. $N_p(x \mid \mu, \Sigma)$. The sample x can be represented as the p by N matrix

\[
x = \begin{bmatrix}
x_{11} & x_{12} & \cdots & x_{1N} \\
x_{21} & x_{22} & \cdots & x_{2N} \\
\vdots & \vdots & & \vdots \\
x_{p1} & x_{p2} & \cdots & x_{pN}
\end{bmatrix}
\]

The columns of x are independently and identically distributed as $N_p(x_\alpha \mid \mu, \Sigma)$. Thus the p.d.f. of x is the product of the p.d.f.'s of the N columns of x:

\[
f(x) = \frac{1}{(2\pi)^{pN/2} |\Sigma|^{N/2}} \exp\left\{-\tfrac{1}{2} \operatorname{tr} \Sigma^{-1} (x - \mu)(x - \mu)'\right\} \tag{2.2}
\]

where tr is the trace of a matrix.

The exponential term of this p.d.f. is obtained by using the following property concerning the trace of the product of matrices.

Let P, Q, and R be matrices such that the product PQR exists. Then

\[
\operatorname{tr}(PQR) = \operatorname{tr}(RPQ) = \operatorname{tr}(QRP)
\]

Specifically, since $(x - \mu)'$, $\Sigma^{-1}$, and $(x - \mu)$ are matrices whose product exists,

\[
\operatorname{tr} (x - \mu)' \Sigma^{-1} (x - \mu) = \operatorname{tr} (x - \mu)(x - \mu)' \Sigma^{-1} = \operatorname{tr} \Sigma^{-1} (x - \mu)(x - \mu)' \tag{2.3}
\]
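The cyclic property of the trace used in (2.3) is easy to verify on a small case. The sketch below is illustrative only: the deviation vector and the inverse covariance matrix are made-up numbers, and the helper functions are hypothetical names written in plain Python.

```python
def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    """Transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*a)]

def trace(a):
    """Sum of the diagonal elements of a square matrix."""
    return sum(a[i][i] for i in range(len(a)))

d = [[1.0], [2.0]]                        # x - mu as a 2 x 1 column (made-up values)
sigma_inv = [[2.0, -0.5], [-0.5, 1.0]]    # hypothetical inverse covariance matrix

quad_form = trace(matmul(matmul(transpose(d), sigma_inv), d))      # tr d' S^-1 d
cycled_once = trace(matmul(matmul(d, transpose(d)), sigma_inv))    # tr d d' S^-1
cycled_twice = trace(matmul(matmul(sigma_inv, d), transpose(d)))   # tr S^-1 d d'
print(quad_form, cycled_once, cycled_twice)  # 4.0 4.0 4.0
```

The first product is a 1 x 1 matrix, so its trace is simply the quadratic form; the other two are traces of 2 x 2 matrices with the same value.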


Maximum Likelihood Estimates
(MLE) of $\mu$ and $\Sigma$

Let x be a random sample of N observations, where $x \sim N_p(x \mid \mu, \Sigma)$, N > p. The likelihood function of x is

\[
L = \frac{1}{(2\pi)^{pN/2} |\Sigma|^{N/2}} \exp\left\{-\tfrac{1}{2} \sum_{\alpha=1}^{N} (x_\alpha - \mu)' \Sigma^{-1} (x_\alpha - \mu)\right\} \tag{2.4}
\]

To find the MLE of $\mu$ and $\Sigma$ it is necessary to maximize the likelihood function L. Since the likelihood function L and its logarithm are maximized at the same value we will consider log L.

\[
\log L = -\tfrac{1}{2} pN \log(2\pi) + \tfrac{1}{2} N \log|\Sigma^{-1}| - \tfrac{1}{2} \sum_{\alpha=1}^{N} (x_\alpha - \mu)' \Sigma^{-1} (x_\alpha - \mu) \tag{2.5}
\]

The  following  properties  will  enable  us  to  rewrite 
equation  (2.5)  in  a  form  which  is  easily  maximized. 

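Definition 1: Let the sample mean be defined as:

\[
\bar{x} = \frac{1}{N} \sum_{\alpha=1}^{N} x_\alpha =
\begin{bmatrix}
\frac{1}{N} \sum_{\alpha=1}^{N} x_{1\alpha} \\
\vdots \\
\frac{1}{N} \sum_{\alpha=1}^{N} x_{p\alpha}
\end{bmatrix}
=
\begin{bmatrix}
\bar{x}_1 \\
\vdots \\
\bar{x}_p
\end{bmatrix}
\tag{2.6}
\]

Definition 2: The matrix of sums of squares and cross products of deviations about the mean is defined as A where

\[
A = \sum_{\alpha=1}^{N} (x_\alpha - \bar{x})(x_\alpha - \bar{x})' \tag{2.7}
\]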

Lemma 2.1: Let $x_1, \ldots, x_N$ be N p-component vectors and let $\bar{x}$ be defined by definition 1. Then for any vector b

\[
\sum_{\alpha=1}^{N} (x_\alpha - b)(x_\alpha - b)' = \sum_{\alpha=1}^{N} (x_\alpha - \bar{x})(x_\alpha - \bar{x})' + N(\bar{x} - b)(\bar{x} - b)' \tag{2.8}
\]

proof: (Ref 1:46)

Lemma 2.2: Let $f(C) = N \log|C| - \operatorname{tr} CD$ where $C = (c_{ij})$ and $D = (d_{ij})$ are both positive definite. Then the maximum of f(C) is taken at

\[
C = N D^{-1}
\]

proof: (Ref 1:47)
Now, by using the property tr(a) = a where a is a scalar and applying equation (2.3), we can rewrite the sum in equation (2.5) in the following form:

\[
\sum_{\alpha=1}^{N} (x_\alpha - \mu)' \Sigma^{-1} (x_\alpha - \mu) = \operatorname{tr} \Sigma^{-1} \sum_{\alpha=1}^{N} (x_\alpha - \mu)(x_\alpha - \mu)' \tag{2.9}
\]

By using lemma 2.1, equation (2.8), and setting b = $\mu$, the right hand side of (2.9) becomes

\[
\operatorname{tr} \Sigma^{-1} A + \operatorname{tr} N (\bar{x} - \mu)' \Sigma^{-1} (\bar{x} - \mu)
\]

Thus

\[
\sum_{\alpha=1}^{N} (x_\alpha - \mu)' \Sigma^{-1} (x_\alpha - \mu) = \operatorname{tr} \Sigma^{-1} A + \operatorname{tr} N (\bar{x} - \mu)' \Sigma^{-1} (\bar{x} - \mu) \tag{2.10}
\]

Now substituting the RHS of (2.10) in equation (2.5) gives us a form that is easy to maximize.

\[
\log L = -\tfrac{1}{2} pN \log(2\pi) + \tfrac{1}{2} N \log|\Sigma^{-1}| - \tfrac{1}{2} \operatorname{tr} \Sigma^{-1} A - \tfrac{1}{2} N (\bar{x} - \mu)' \Sigma^{-1} (\bar{x} - \mu) \tag{2.11}
\]

The first term of (2.11) is a constant and is therefore already at its maximum value. The last term is at its maximum value of zero when $\mu = \bar{x}$. Since the remaining terms are not functions of $\mu$, the MLE of $\mu$, denoted $\hat{\mu}$, is $\bar{x}$.

To find the MLE for $\Sigma$ notice that the second and third terms of (2.11) are functions of $\Sigma^{-1}$ alone and therefore can be maximized by applying lemma 2.2, putting $\Sigma^{-1}$ for C and A for D. Thus, the maximum of log L occurs when

\[
\Sigma^{-1} = N A^{-1}
\]

To summarize, the MLE for $\mu$ and $\Sigma$ are

\[
\hat{\mu} = \bar{x}, \qquad \hat{\Sigma} = \frac{A}{N} \tag{2.12}
\]
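The estimates of (2.12) can be illustrated for the bivariate case p = 2. The following is a minimal sketch under stated assumptions: the four observations are made up, the function names are hypothetical, and the log likelihood is the form (2.5) written out for a 2 x 2 covariance matrix.

```python
import math

def mle_bivariate(sample):
    """Return (mean, A, sigma_hat) for a list of (x, y) observations."""
    n = len(sample)
    mx = sum(x for x, _ in sample) / n
    my = sum(y for _, y in sample) / n
    a11 = sum((x - mx) ** 2 for x, _ in sample)
    a22 = sum((y - my) ** 2 for _, y in sample)
    a12 = sum((x - mx) * (y - my) for x, y in sample)
    a = [[a11, a12], [a12, a22]]
    sigma_hat = [[v / n for v in row] for row in a]   # divisor N, not N - 1
    return (mx, my), a, sigma_hat

def log_likelihood(sample, mu, sigma):
    """log L of equation (2.5) for p = 2."""
    n = len(sample)
    det = sigma[0][0] * sigma[1][1] - sigma[0][1] * sigma[1][0]
    quad = 0.0
    for x, y in sample:
        dx, dy = x - mu[0], y - mu[1]
        # Quadratic form d' Sigma^-1 d written out with the 2 x 2 inverse.
        quad += (sigma[1][1] * dx * dx - 2 * sigma[0][1] * dx * dy
                 + sigma[0][0] * dy * dy) / det
    return -n * math.log(2 * math.pi) - 0.5 * n * math.log(det) - 0.5 * quad

# Made-up data, for illustration only.
data = [(1.0, 2.0), (2.0, 1.0), (3.0, 5.0), (4.0, 3.0)]
mu_hat, a, sigma_hat = mle_bivariate(data)
print(mu_hat)      # (2.5, 2.75)
print(sigma_hat)   # [[1.25, 0.875], [0.875, 2.1875]]
print(log_likelihood(data, mu_hat, sigma_hat))
```

Moving the mean or rescaling the covariance matrix away from these values lowers the log likelihood, which is the content of lemma 2.2 applied to (2.11).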

The  Wishart  Distribution 

Let $x = (x_1, x_2, \ldots, x_N)$ be a random sample of size N from $N_p(x \mid 0, \Sigma)$. The Wishart matrix A is defined as the p x p symmetric matrix of sums of squares and sums of products of the sample observations.

Let

\[
A = \sum_{\alpha=1}^{N} (x_\alpha - \bar{x})(x_\alpha - \bar{x})'
\]

Then it is known that A has the following p.d.f. (known as the Wishart distribution with n = N - 1 degrees of freedom) (Ref 1:54):

\[
f(A) = K(\Sigma, n) \, |A|^{\frac{n - p - 1}{2}} \exp\left\{-\tfrac{1}{2} \operatorname{tr} \Sigma^{-1} A\right\} \tag{2.13}
\]

where A > 0 (i.e., A is positive definite) and

\[
K(\Sigma, n) = \frac{1}{2^{\frac{np}{2}} \, \pi^{\frac{p(p-1)}{4}} \, |\Sigma|^{\frac{n}{2}} \prod_{i=1}^{p} \Gamma\left(\frac{n + 1 - i}{2}\right)}
\]

The distribution of A is denoted $W(A \mid \Sigma, n)$, where n represents the degrees of freedom. Note: When p = 1 and $\Sigma$ = 1, the Wishart distribution is a chi-square distribution with n degrees of freedom.

Theorems  Concerning  the 
Wishart  Distribution 

The  following  theorems  are  necessary  in  order  to 
derive  the  results  in  later  chapters. 


Theorem 2.1: If $A_1, A_2, \ldots, A_g$ are matrices, each independently distributed as $W(A_i \mid \Sigma, n_i)$, then

\[
A = \sum_{i=1}^{g} A_i
\]

is distributed $W\left(A \mid \Sigma, \sum_{i=1}^{g} n_i\right)$.

proof: (Ref 1:162)

Theorem 2.2: Let A and T both be p by p positive definite matrices, then

\[
\int_{A > 0} |A|^{\frac{a - p - 1}{2}} \exp\left\{-\tfrac{1}{2} \operatorname{tr} T^{-1} A\right\} dA = K^{-1}(T, a)
\]

where

\[
K(T, a) = \frac{1}{2^{\frac{ap}{2}} \, \pi^{\frac{p(p-1)}{4}} \, |T|^{\frac{a}{2}} \prod_{i=1}^{p} \Gamma\left(\frac{a + 1 - i}{2}\right)}
\]

This result follows directly from the Wishart p.d.f., since multiplying both sides by K(T, a) gives the integral of the Wishart p.d.f. Thus, if we denote the function under the integral as f(A) we have

\[
K(T, a) \int_{A > 0} f(A) \, dA = 1
\]

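Theorem 2.3: Let $A \sim W(A \mid \Sigma, n)$, then

\[
E(|A|^h) = \frac{K(\Sigma, n)}{K(\Sigma, n + 2h)}
\]

where $K(\Sigma, n)$ is defined as in theorem 2.2.

Proof:

\[
\begin{aligned}
E(|A|^h) &= \int_{A > 0} |A|^h f(A) \, dA = \int_{A > 0} |A|^h K(\Sigma, n) \, |A|^{\frac{n - p - 1}{2}} \exp\left\{-\tfrac{1}{2} \operatorname{tr} \Sigma^{-1} A\right\} dA \\
&= K(\Sigma, n) \int_{A > 0} |A|^{h + \frac{n - p - 1}{2}} \exp\left\{-\tfrac{1}{2} \operatorname{tr} \Sigma^{-1} A\right\} dA \\
&= \frac{K(\Sigma, n)}{K(\Sigma, n + 2h)}
\end{aligned}
\]

where the last step applies theorem 2.2 with T = $\Sigma$ and a = n + 2h.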
Suppose $\omega \subset \Omega$, where $(\mu, \Sigma) \in \omega$; then the likelihood function for $\omega$ is

\[
L(\omega) = \frac{1}{(2\pi)^{pN/2} |\Sigma|^{N/2}} \exp\left\{-\tfrac{1}{2} \sum_{\alpha=1}^{N} (x_\alpha - \mu)' \Sigma^{-1} (x_\alpha - \mu)\right\} \tag{2.15}
\]

Denote the maximum of $L(\Omega)$ in $\Omega$ by $L(\hat{\Omega})$ and denote the maximum of $L(\omega)$ in $\omega$ by $L(\hat{\omega})$. Then the likelihood ratio criterion is

\[
\lambda = \frac{\max_{\omega} L(\omega)}{\max_{\Omega} L(\Omega)} = \frac{L(\hat{\omega})}{L(\hat{\Omega})} \tag{2.16}
\]

The statistic $\lambda$ will be our criterion for hypothesis testing. As explained earlier, the value of $\lambda$ is between 0 and 1; a small value of $\lambda$ leads to rejecting our hypothesis, whereas a value near unity gives strong support for not rejecting it.

The subject of the next chapter is the determination of the criterion $\lambda$ for the three separate hypotheses.
III. Derivation of the Criteria


Let $x^g$ (g = 1, 2, ..., k) be each independent and identically distributed as $N_p(x \mid \mu^g, \Sigma_g)$. In this chapter we shall derive the likelihood ratio criteria for the following hypotheses:

\[
\begin{aligned}
&\text{[i]} \quad H_0: \ \mu^1 = \mu^2 = \cdots = \mu^k, \quad \Sigma_1 = \Sigma_2 = \cdots = \Sigma_k \\
&\text{[ii]} \quad H_1: \ \Sigma_1 = \Sigma_2 = \cdots = \Sigma_k \\
&\text{[iii]} \quad H_2: \ \mu^1 = \mu^2 = \cdots = \mu^k \ \text{ given } \ \Sigma_1 = \Sigma_2 = \cdots = \Sigma_k
\end{aligned}
\tag{3.1}
\]

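Let $x^g$ be a random sample of size $N_g$ from the g-th population; then, using definition (2.6), the mean $\bar{x}^g$ of the sample from the g-th population is denoted by

\[
\bar{x}^g = \frac{1}{N_g} \sum_{\alpha=1}^{N_g} x_\alpha^g
\]

Let the combined mean $\bar{x}$ of all the samples be denoted by

\[
\bar{x} = \frac{\sum_{g=1}^{k} N_g \bar{x}^g}{\sum_{g=1}^{k} N_g} = \frac{1}{N} \sum_{g=1}^{k} N_g \bar{x}^g
\]

where

\[
N = \sum_{g=1}^{k} N_g
\]

Let $A_g$ denote the matrix of sums of squares and products for the sample from the g-th population, as defined in (2.7); i.e.,

\[
A_g = \sum_{\alpha=1}^{N_g} (x_\alpha^g - \bar{x}^g)(x_\alpha^g - \bar{x}^g)', \qquad g = 1, 2, \ldots, k \tag{3.2}
\]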


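Likelihood Estimates of
$\mu^g$ and $\Sigma_g$

The k populations are independent; therefore, the likelihood function of all the sample observations is the product of the separate likelihood functions. Generalizing the results in Chapter II to k populations, the MLE for $\mu^g$ and $\Sigma_g$ become

\[
\hat{\mu}^g = \bar{x}^g, \qquad \hat{\Sigma}_g = \frac{A_g}{N_g}
\]

The likelihood function for our parameter space $\Omega = (\mu^1, \mu^2, \ldots, \mu^k, \Sigma_1, \Sigma_2, \ldots, \Sigma_k)$, where $-\infty < \mu^i < \infty$ and $0 < \Sigma_i < \infty$, i = 1, 2, ..., k, is

\[
L(\Omega) = \prod_{g=1}^{k} \frac{1}{(2\pi)^{\frac{pN_g}{2}} |\Sigma_g|^{\frac{N_g}{2}}} \exp\left\{-\tfrac{1}{2} \sum_{g=1}^{k} \operatorname{tr} \Sigma_g^{-1} A_g - \tfrac{1}{2} \sum_{g=1}^{k} N_g (\bar{x}^g - \mu^g)' \Sigma_g^{-1} (\bar{x}^g - \mu^g)\right\}
\]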


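Taking the logarithm of $L(\Omega)$ gives

\[
\log L(\Omega) = \sum_{g=1}^{k} \left(-\frac{pN_g}{2}\right) \log(2\pi) + \sum_{g=1}^{k} \frac{N_g}{2} \log|\Sigma_g^{-1}| - \tfrac{1}{2} \sum_{g=1}^{k} \operatorname{tr} \Sigma_g^{-1} A_g - \tfrac{1}{2} \operatorname{tr} \sum_{g=1}^{k} N_g (\bar{x}^g - \mu^g)' \Sigma_g^{-1} (\bar{x}^g - \mu^g)
\]

Setting $\sum_{g=1}^{k} N_g = N$ and bringing the exponent of $\Sigma_g^{-1}$ in front of the logarithm gives

\[
\log L(\Omega) = -\frac{pN}{2} \log(2\pi) - \tfrac{1}{2} \sum_{g=1}^{k} N_g \log|\Sigma_g| - \tfrac{1}{2} \sum_{g=1}^{k} \operatorname{tr} \Sigma_g^{-1} A_g - \tfrac{1}{2} \operatorname{tr} \sum_{g=1}^{k} N_g (\bar{x}^g - \mu^g)' \Sigma_g^{-1} (\bar{x}^g - \mu^g) \tag{3.3}
\]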


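The last term on the RHS has its maximum value when $\mu^g = \bar{x}^g$, and substituting this brings the term to zero. Also, substituting the MLE of $\Sigma_g$ in equation (3.3) gives the maximum value of $L(\Omega)$. Thus

\[
\log L(\hat{\Omega}) = -\frac{pN}{2} \log(2\pi) - \tfrac{1}{2} \sum_{g=1}^{k} N_g \log\left|\frac{A_g}{N_g}\right| - \tfrac{1}{2} \sum_{g=1}^{k} \operatorname{tr} (N_g A_g^{-1}) A_g
\]

The last term becomes $-\tfrac{1}{2} \sum_{g=1}^{k} N_g \operatorname{tr} I$. Since tr I = p and $\sum_{g=1}^{k} N_g = N$ we have

\[
\begin{aligned}
\log L(\hat{\Omega}) &= \log (2\pi)^{-\frac{pN}{2}} + \sum_{g=1}^{k} \log\left|\frac{A_g}{N_g}\right|^{-\frac{N_g}{2}} + \log\left\{\exp\left(-\frac{pN}{2}\right)\right\} \\
&= \log\left[(2\pi)^{-\frac{pN}{2}} \prod_{g=1}^{k} \left|\frac{A_g}{N_g}\right|^{-\frac{N_g}{2}} \exp\left(-\frac{pN}{2}\right)\right]
\end{aligned}
\]

\[
L(\hat{\Omega}) = (2\pi)^{-\frac{pN}{2}} \prod_{g=1}^{k} \left|\frac{A_g}{N_g}\right|^{-\frac{N_g}{2}} \exp\left(-\frac{pN}{2}\right)
\]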


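Derivation of Criteria for $H_0$

To test the hypothesis

\[
H_0: \ \mu^1 = \mu^2 = \cdots = \mu^k \ \text{ and } \ \Sigma_1 = \Sigma_2 = \cdots = \Sigma_k \tag{3.4}
\]

substitute parameters $\mu^g = \mu$ and $\Sigma_g = \Sigma$ into the likelihood function. Thus, equation (3.3) becomes

\[
\log L(\omega_0) = -\frac{pN}{2} \log(2\pi) - \tfrac{1}{2} \sum_{g=1}^{k} N_g \log|\Sigma| - \tfrac{1}{2} \sum_{g=1}^{k} \operatorname{tr} \Sigma^{-1} A_g - \tfrac{1}{2} \operatorname{tr} \Sigma^{-1} \sum_{g=1}^{k} N_g (\bar{x}^g - \mu)(\bar{x}^g - \mu)'
\]

Now, substituting $\sum_{g=1}^{k} N_g = N$ and $\sum_{g=1}^{k} A_g = A$ gives

\[
\log L(\omega_0) = -\frac{pN}{2} \log(2\pi) - \frac{N}{2} \log|\Sigma| - \tfrac{1}{2} \operatorname{tr} \Sigma^{-1} A - \tfrac{1}{2} \operatorname{tr} \Sigma^{-1} \sum_{g=1}^{k} N_g (\bar{x}^g - \mu)(\bar{x}^g - \mu)' \tag{3.5}
\]

Combining the last two terms and applying lemma 2.1 with b = $\mu$, equation (3.5) can be written as

\[
\log L(\omega_0) = -\frac{pN}{2} \log(2\pi) - \frac{N}{2} \log|\Sigma| - \tfrac{1}{2} \operatorname{tr} \Sigma^{-1} \left[A + \sum_{g=1}^{k} N_g (\bar{x}^g - b)(\bar{x}^g - b)'\right] \tag{3.6}
\]

The term in the bracket becomes

\[
A + \sum_{g=1}^{k} N_g (\bar{x}^g - \bar{x})(\bar{x}^g - \bar{x})' + N(\bar{x} - \mu)(\bar{x} - \mu)' \tag{3.7}
\]

Thus, $\log L(\omega_0)$ is maximized with $\hat{\mu} = \bar{x}$.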
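For convenience we will denote $\sum_{g=1}^{k} N_g (\bar{x}^g - \bar{x})(\bar{x}^g - \bar{x})'$ as B.

To find the estimate for $\Sigma$ it is necessary to maximize equation (3.5) with respect to $\Sigma$:

\[
f(\Sigma^{-1}) = -\frac{pN}{2} \log(2\pi) + \frac{N}{2} \log|\Sigma^{-1}| - \tfrac{1}{2} \operatorname{tr} \Sigma^{-1} (A + B) \tag{3.8}
\]

Applying lemma 2.2 on the last two terms with $C = \Sigma^{-1}$ and D = A + B, the maximum of $f(\Sigma^{-1})$ is at $C = N D^{-1}$, or $\Sigma^{-1} = N(A + B)^{-1}$. Thus, the estimate for $\Sigma$ is

\[
\hat{\Sigma} = \frac{A + B}{N}
\]

Now, substituting $\hat{\Sigma}$ for $\Sigma$ in equation (3.8) gives

\[
\log L(\hat{\omega}_0) = -\frac{pN}{2} \log(2\pi) - \frac{N}{2} \log\left|\frac{A + B}{N}\right| - \tfrac{1}{2} \operatorname{tr} \left(\frac{A + B}{N}\right)^{-1} (A + B)
\]

so that

\[
L(\hat{\omega}_0) = (2\pi)^{-\frac{pN}{2}} \left|\frac{A + B}{N}\right|^{-\frac{N}{2}} \exp\left(-\frac{pN}{2}\right) \tag{3.9}
\]

Therefore, the criterion $\lambda_0$ for testing hypothesis $H_0$ is the likelihood ratio

\[
\lambda_0 = \frac{L(\hat{\omega}_0)}{L(\hat{\Omega})} = \frac{\prod_{g=1}^{k} |A_g|^{\frac{N_g}{2}} \; N^{\frac{pN}{2}}}{|A + B|^{\frac{N}{2}} \prod_{g=1}^{k} N_g^{\frac{pN_g}{2}}} \tag{3.10}
\]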
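Derivation of Criteria for $H_1$

To test the hypothesis

\[
H_1: \ \Sigma_1 = \Sigma_2 = \cdots = \Sigma_k, \ \text{the means having any value,}
\]

our parameter space $\omega_1$ belongs to $\Omega$, where $0 < \Sigma_1 = \Sigma_2 = \cdots = \Sigma_k < \infty$. Thus we substitute $\Sigma$ for $\Sigma_g$ in equation (3.3) to get

\[
\log L(\omega_1) = -\frac{pN}{2} \log(2\pi) - \tfrac{1}{2} \sum_{g=1}^{k} N_g \log|\Sigma| - \tfrac{1}{2} \sum_{g=1}^{k} \operatorname{tr} \Sigma^{-1} A_g - \tfrac{1}{2} \operatorname{tr} \sum_{g=1}^{k} N_g (\bar{x}^g - \mu^g)' \Sigma^{-1} (\bar{x}^g - \mu^g) \tag{3.11}
\]

The last term is maximized when $\mu^g = \bar{x}^g$. By substituting $\sum_{g=1}^{k} N_g = N$ and $A = \sum_{g=1}^{k} A_g$ and bringing the trace in front of the summation we have

\[
\log L(\omega_1) = -\frac{pN}{2} \log(2\pi) - \frac{N}{2} \log|\Sigma| - \tfrac{1}{2} \operatorname{tr} \Sigma^{-1} A
\]

Now using lemma 2.2, we have $\Sigma^{-1} = N A^{-1}$, or $\hat{\Sigma} = A/N$. Using $A/N$ for $\Sigma$ the maximized function becomes

\[
\begin{aligned}
\log L(\hat{\omega}_1) &= -\frac{pN}{2} \log(2\pi) - \frac{N}{2} \log\left|\frac{A}{N}\right| - \tfrac{1}{2} \operatorname{tr} \left(\frac{A}{N}\right)^{-1} A \\
&= -\frac{pN}{2} \log(2\pi) - \frac{N}{2} \log\left|\frac{A}{N}\right| - \tfrac{1}{2} \operatorname{tr} N I
\end{aligned}
\]

Using tr I = p and rewriting gives

\[
\log L(\hat{\omega}_1) = \log (2\pi)^{-\frac{pN}{2}} + \log\left\{\left|\frac{A}{N}\right|^{-\frac{N}{2}} \exp\left(-\frac{pN}{2}\right)\right\}
\]

Therefore, exponentiating gives

\[
L(\hat{\omega}_1) = (2\pi)^{-\frac{pN}{2}} \left|\frac{A}{N}\right|^{-\frac{N}{2}} \exp\left(-\frac{pN}{2}\right) \tag{3.12}
\]

The likelihood ratio criterion $\lambda_1$ for testing the hypothesis $H_1$ is given by

\[
\lambda_1 = \frac{L(\hat{\omega}_1)}{L(\hat{\Omega})} = \frac{\prod_{g=1}^{k} |A_g|^{\frac{N_g}{2}} \; N^{\frac{pN}{2}}}{|A|^{\frac{N}{2}} \prod_{g=1}^{k} N_g^{\frac{pN_g}{2}}} \tag{3.13}
\]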


Derivation of Criteria for $H_2$

To test hypothesis

\[
H_2: \ \mu^1 = \mu^2 = \cdots = \mu^k \ \text{ given that } \ \Sigma_1 = \Sigma_2 = \cdots = \Sigma_k
\]

note that the log of the likelihood function, equation (3.3), with the condition $\Sigma_1 = \Sigma_2 = \cdots = \Sigma_k$ imposed is the same as $\log L(\omega_1)$, equation (3.12). Further, when the restriction $\mu^1 = \mu^2 = \cdots = \mu^k$ is imposed, $\log L(\omega_1)$ is the same as $\log L(\omega_0)$ given in equation (3.9). Thus, the criterion $\lambda_2$ can be represented as the ratio of these two maximized likelihoods:

\[
\lambda_2 = \frac{L(\hat{\omega}_0)}{L(\hat{\omega}_1)} = \frac{|A|^{\frac{N}{2}}}{|A + B|^{\frac{N}{2}}} \tag{3.14}
\]

Particular  Cases 


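In this thesis we will be concerned with equal sample sizes and p = 2. Thus, when $N_1 = N_2 = \cdots = N_k = n$ and N = kn the three criteria take the form:

\[
\lambda_0 = k^{kn} \, \frac{\prod_{g=1}^{k} |A_g|^{\frac{n}{2}}}{|A + B|^{\frac{kn}{2}}} \tag{3.15}
\]

\[
\lambda_1 = k^{kn} \, \frac{\prod_{g=1}^{k} |A_g|^{\frac{n}{2}}}{|A|^{\frac{kn}{2}}} \tag{3.16}
\]

\[
\lambda_2 = \frac{|A|^{\frac{kn}{2}}}{|A + B|^{\frac{kn}{2}}} \tag{3.17}
\]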
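The three criteria for p = 2 and equal sample sizes can be computed directly from the matrices $A_g$, A, and B. The sketch below is illustrative and not taken from the thesis: the function names are hypothetical, the data are made up, and the formulas coded are $\lambda_0 = k^{kn} \prod |A_g|^{n/2} / |A+B|^{kn/2}$, $\lambda_1 = k^{kn} \prod |A_g|^{n/2} / |A|^{kn/2}$, and $\lambda_2 = (|A| / |A+B|)^{kn/2}$.

```python
import math

def mean_and_sscp(sample):
    """Mean vector and 2 x 2 matrix of sums of squares and products about it."""
    n = len(sample)
    mx = sum(x for x, _ in sample) / n
    my = sum(y for _, y in sample) / n
    a11 = sum((x - mx) ** 2 for x, _ in sample)
    a22 = sum((y - my) ** 2 for _, y in sample)
    a12 = sum((x - mx) * (y - my) for x, y in sample)
    return (mx, my), [[a11, a12], [a12, a22]]

def det2(m):
    """Determinant of a 2 x 2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def lr_criteria(samples):
    """Return (lam0, lam1, lam2) for k bivariate samples of equal size n."""
    k, n = len(samples), len(samples[0])
    if any(len(s) != n for s in samples):
        raise ValueError("this sketch assumes equal sample sizes")
    stats = [mean_and_sscp(s) for s in samples]
    gx = sum(m[0] for m, _ in stats) / k    # combined mean (equal sample sizes)
    gy = sum(m[1] for m, _ in stats) / k
    a = [[sum(ag[i][j] for _, ag in stats) for j in range(2)] for i in range(2)]
    b = [[0.0, 0.0], [0.0, 0.0]]            # between-sample matrix B
    for (mx, my), _ in stats:
        d = (mx - gx, my - gy)
        for i in range(2):
            for j in range(2):
                b[i][j] += n * d[i] * d[j]
    ab = [[a[i][j] + b[i][j] for j in range(2)] for i in range(2)]
    log_within = sum(0.5 * n * math.log(det2(ag)) for _, ag in stats)
    log_k = k * n * math.log(k)
    lam0 = math.exp(log_k + log_within - 0.5 * k * n * math.log(det2(ab)))
    lam1 = math.exp(log_k + log_within - 0.5 * k * n * math.log(det2(a)))
    lam2 = math.exp(0.5 * k * n * (math.log(det2(a)) - math.log(det2(ab))))
    return lam0, lam1, lam2

# Made-up data: k = 2 samples of n = 4 bivariate observations.
s1 = [(1.0, 2.0), (2.0, 1.0), (3.0, 5.0), (4.0, 3.0)]
s2 = [(2.0, 4.0), (3.0, 3.0), (5.0, 6.0), (6.0, 4.0)]
lam0, lam1, lam2 = lr_criteria([s1, s2])
print(lam0, lam1 * lam2)   # lam0 factors as lam1 * lam2
```

The factorization $\lambda_0 = \lambda_1 \lambda_2$ follows from the ratio form of the three criteria and serves as a quick consistency check on any implementation.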


In this chapter we derived the likelihood ratio criteria to test each of the three hypotheses. Next, we must determine a critical region for testing $H_i$, i = 0, 1, 2. Our critical region is the set defined by $0 < \lambda < \phi$ and our decision rule is to reject $H_i$ if $\lambda < \phi$. The function $\lambda$ defines a random variable $\lambda(x_1, x_2, \ldots, x_N)$ and the significance level of the test is given by

\[
\alpha = \Pr[\lambda < \phi; \ H_i]
\]

We determine these probabilities by finding the sampling distributions of our likelihood ratio criteria $\lambda_0$, $\lambda_1$, and $\lambda_2$.

In the next chapter we will obtain the moments of the criteria, from which the sampling distributions will be derived in Chapter V.
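Before the exact distributions are available, the meaning of the critical value $\phi$ can be illustrated by simulation. The sketch below is not the method of this thesis, which derives the sampling distributions analytically; it is a Monte Carlo approximation for $\lambda_0$ with p = 2 and equal sample sizes. It assumes that the null distribution of the criterion does not depend on the common mean and covariance matrix, so standard normal variates are used, and it uses the fact that A + B is the matrix of sums of squares and products of the pooled observations about the combined mean. All function names are hypothetical.

```python
import math
import random

def sscp_det(obs):
    """Determinant of the 2 x 2 sums of squares and products matrix about the mean."""
    m = len(obs)
    mx = sum(x for x, _ in obs) / m
    my = sum(y for _, y in obs) / m
    a11 = sum((x - mx) ** 2 for x, _ in obs)
    a22 = sum((y - my) ** 2 for _, y in obs)
    a12 = sum((x - mx) * (y - my) for x, y in obs)
    return a11 * a22 - a12 * a12

def lambda0(samples):
    """Criterion lambda_0 for k bivariate samples of equal size n."""
    k, n = len(samples), len(samples[0])
    pooled = [obs for s in samples for obs in s]   # its SSCP matrix equals A + B
    log_lam = (k * n * math.log(k)
               + sum(0.5 * n * math.log(sscp_det(s)) for s in samples)
               - 0.5 * k * n * math.log(sscp_det(pooled)))
    return math.exp(log_lam)

def simulated_percentage_point(alpha, k, n, reps=2000, seed=0):
    """Approximate phi with Pr[lambda_0 < phi; H_0] close to alpha."""
    rng = random.Random(seed)
    values = []
    for _ in range(reps):
        samples = [[(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(n)]
                   for _ in range(k)]
        values.append(lambda0(samples))
    values.sort()
    return values[int(alpha * reps)]   # empirical lower alpha quantile

# Hypothetical usage: k = 2 samples of n = 10 observations each.
print(simulated_percentage_point(0.05, k=2, n=10))
print(simulated_percentage_point(0.01, k=2, n=10))
```

An observed $\lambda_0$ below the simulated value would lead to rejection of $H_0$ at roughly the stated level; the accuracy depends on the number of replications.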


IV.  Derivation  of  the  Moments 


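To obtain the sampling distributions of the criteria $\lambda_0$, $\lambda_1$ and $\lambda_2$, we need to obtain their h-th moments, which are derived below.

The h-th Moment of $\lambda_0$

The h-th moment of $\lambda_0$ is

\[
E(\lambda_0^h) = K_0 \, E(L_0^h) \tag{4.1}
\]

where

\[
K_0 = \frac{N^{\frac{pNh}{2}}}{\prod_{g=1}^{k} N_g^{\frac{pN_g h}{2}}}
\qquad \text{and} \qquad
L_0 = \frac{\prod_{g=1}^{k} |A_g|^{\frac{N_g}{2}}}{|A + B|^{\frac{N}{2}}}
\]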


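By theorem 2.4, $A_g \sim W(A_g \mid \Sigma, n_g)$. The matrix B is the matrix of sums of squares and sums of products between means for the k samples, thus $B \sim W(B \mid \Sigma, k-1)$. Theorem 3.3.2 (Ref 1:53) establishes the independence of the sample means and covariance matrices, and since $B = \sum_{g=1}^{k} N_g (\bar{x}^g - \bar{x})(\bar{x}^g - \bar{x})'$ it follows that $A_g$ and B are independent.

Since $L_0$ is a function of $A_g$ and B we have by definition

\[
\begin{aligned}
E(L_0^h) &= \int_{A_g > 0, \, B > 0} \frac{\prod_{g=1}^{k} |A_g|^{\frac{N_g h}{2}}}{|A + B|^{\frac{Nh}{2}}} \prod_{g=1}^{k} W(A_g \mid \Sigma, n_g) \, W(B \mid \Sigma, k-1) \prod_{g=1}^{k} dA_g \, dB \\
&= \prod_{g=1}^{k} K(\Sigma, n_g) \int_{A_g > 0, \, B > 0} |A + B|^{-\frac{Nh}{2}} \prod_{g=1}^{k} |A_g|^{\frac{N_g h + n_g - p - 1}{2}} \exp\left(-\tfrac{1}{2} \operatorname{tr} \Sigma^{-1} A_g\right) W(B \mid \Sigma, k-1) \prod_{g=1}^{k} dA_g \, dB \\
&= \prod_{g=1}^{k} \frac{K(\Sigma, n_g)}{K(\Sigma, N_g h + n_g)} \int_{A_g > 0, \, B > 0} |A + B|^{-\frac{Nh}{2}} \prod_{g=1}^{k} W(A_g \mid \Sigma, N_g h + n_g) \, W(B \mid \Sigma, k-1) \prod_{g=1}^{k} dA_g \, dB
\end{aligned}
\tag{4.2}
\]

The integral on the right hand side of (4.2) is equal to $E(|A + B|^{-\frac{Nh}{2}})$, where $A = A_1 + A_2 + \cdots + A_k$ and B are independent. Therefore,

\[
E(L_0^h) = \prod_{g=1}^{k} \frac{K(\Sigma, n_g)}{K(\Sigma, N_g h + n_g)} \; E\left(|A + B|^{-\frac{Nh}{2}}\right) \tag{4.3}
\]

Recall that $A \sim W\{A \mid \Sigma, \sum_{g=1}^{k} (N_g h + n_g)\}$ and $B \sim W(B \mid \Sigma, k-1)$, so by theorem 2.1 $(A + B) \sim W\{(A + B) \mid \Sigma, \sum_{g=1}^{k} (N_g h + n_g) + k - 1\}$.

Now $n_g = N_g - 1$ and $\sum_{g=1}^{k} N_g = N$. Therefore,

\[
\sum_{g=1}^{k} (N_g h + n_g) + k - 1 = Nh + N - k + k - 1 = Nh + N - 1.
\]

Hence, $(A + B) \sim W\{(A + B) \mid \Sigma, Nh + N - 1\}$. Thus, using theorem 2.3, substituting (A + B) for A and $-\frac{Nh}{2}$ for h, we have

\[
E\left(|A + B|^{-\frac{Nh}{2}}\right) = \frac{K(\Sigma, Nh + N - 1)}{K\left(\Sigma, Nh + N - 1 + 2\left(-\frac{Nh}{2}\right)\right)} = \frac{K(\Sigma, Nh + N - 1)}{K(\Sigma, N - 1)} \tag{4.4}
\]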


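From equations (4.3) and (4.4) we have

\[
E(\lambda_0^h) = K_0 \prod_{g=1}^{k} \frac{K(\Sigma, n_g)}{K(\Sigma, N_g h + n_g)} \cdot \frac{K(\Sigma, Nh + N - 1)}{K(\Sigma, N - 1)} \tag{4.5}
\]

Using the definition of $K(\Sigma, n)$ in terms of gamma functions from equation (2.13) we have

\[
E(\lambda_0^h) = \frac{N^{\frac{pNh}{2}}}{\prod_{g=1}^{k} N_g^{\frac{pN_g h}{2}}} \prod_{i=1}^{p} \left\{ \prod_{g=1}^{k} \frac{\Gamma\left(\frac{N_g h + n_g + 1 - i}{2}\right)}{\Gamma\left(\frac{n_g + 1 - i}{2}\right)} \cdot \frac{\Gamma\left(\frac{N - i}{2}\right)}{\Gamma\left(\frac{Nh + N - i}{2}\right)} \right\} \tag{4.6}
\]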


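For the particular case of $p = 2$, $N_1 = N_2 = \dots = N_k = n$, $N = nk$ and $n_g = N_g - 1 = n - 1$, equation (4.6) becomes

\[
\begin{aligned}
E(\lambda_0^h) &= \frac{(nk)^{knh}}{\prod_{g=1}^{k} n^{nh}} \prod_{g=1}^{k} \frac{\Gamma\left(\frac{nh + n - 1}{2}\right)\Gamma\left(\frac{nh + n - 2}{2}\right)}{\Gamma\left(\frac{n-1}{2}\right)\Gamma\left(\frac{n-2}{2}\right)} \cdot \frac{\Gamma\left(\frac{nk-1}{2}\right)\Gamma\left(\frac{nk-2}{2}\right)}{\Gamma\left(\frac{nkh + kn - 1}{2}\right)\Gamma\left(\frac{nkh + kn - 2}{2}\right)} \\
&= k^{knh}\, \frac{\left[\Gamma\left\{\frac{n(h+1) - 1}{2}\right\}\Gamma\left\{\frac{n(h+1) - 2}{2}\right\}\right]^{k}}{\left[\Gamma\left(\frac{n-1}{2}\right)\Gamma\left(\frac{n-2}{2}\right)\right]^{k}} \cdot \frac{\Gamma\left(\frac{nk-1}{2}\right)\Gamma\left(\frac{nk-2}{2}\right)}{\Gamma\left\{\frac{kn(h+1) - 1}{2}\right\}\Gamma\left\{\frac{kn(h+1) - 2}{2}\right\}}
\end{aligned}
\tag{4.7}
\]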
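The half-integer gamma products in (4.7) pair up under the duplication identity $\Gamma(z)\Gamma(z+\tfrac{1}{2}) = 2^{1-2z}\sqrt{\pi}\,\Gamma(2z)$, and the powers of 2 cancel between numerator and denominator. The following is a minimal numerical sketch of that collapse, not part of the original derivation; the function names are hypothetical, and the closed form coded in `log_moment_closed` is the one obtained by applying the identity to each pair in (4.7).

```python
from math import lgamma, log

def log_moment_4_7(h, n, k):
    """log E(lambda_0^h) written with the half-integer gammas of (4.7)."""
    a = n * (1.0 + h)        # n(h + 1)
    b = n * k * (1.0 + h)    # kn(h + 1)
    return (k * n * h * log(k)
            + k * (lgamma((a - 1) / 2) + lgamma((a - 2) / 2))
            - k * (lgamma((n - 1) / 2) + lgamma((n - 2) / 2))
            + lgamma((n * k - 1) / 2) + lgamma((n * k - 2) / 2)
            - lgamma((b - 1) / 2) - lgamma((b - 2) / 2))

def log_moment_closed(h, n, k):
    """Same quantity after pairing gammas with the duplication identity."""
    a = n * (1.0 + h)
    b = n * k * (1.0 + h)
    return (k * n * h * log(k)
            + lgamma(n * k - 2) - k * lgamma(n - 2)
            + k * lgamma(a - 2) - lgamma(b - 2))

if __name__ == "__main__":
    for h, n, k in [(0.5, 5, 2), (1.0, 10, 3), (2.0, 4, 6)]:
        print(h, n, k, log_moment_4_7(h, n, k), log_moment_closed(h, n, k))
```

Working on the log scale with `lgamma` avoids overflow for the larger values of $kn(h+1)$; the sketch assumes $n \ge 3$ and $h \ge 0$ so that every gamma argument is positive.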


Now applying Gauss's multiplication formula (Ref 9:11)

\[
\Gamma(mz) = \frac{m^{mz - \frac{1}{2}}}{(2\pi)^{\frac{m-1}{2}}} \prod_{i=0}^{m-1} \Gamma\left(z + \frac{i}{m}\right)
\tag{4.7a}
\]

with $m = 2$ we can rewrite equation (4.7) in the following simpler form.

\[
E(\lambda_0^h) = k^{knh}\, \frac{\Gamma(kn - 2)}{[\Gamma(n - 2)]^{k}} \cdot \frac{[\Gamma\{n(h+1) - 2\}]^{k}}{\Gamma\{kn(h+1) - 2\}}
\tag{4.8}
\]

The $h$th Moment of $\lambda_1$


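The $h$th moment of $\lambda_1$ is

\[
E(\lambda_1^h) = K_1 \cdot E(L_1^h)
\tag{4.9}
\]

where

\[
K_1 = \frac{N^{pNh/2}}{\prod_{g=1}^{k} N_g^{pN_g h/2}}
\quad \text{and} \quad
L_1 = \frac{\prod_{g=1}^{k} |A_g|^{N_g/2}}{|A|^{N/2}}.
\]

Since $L_1$ is a function of $A_g$ and $A_g \sim W(A_g \mid \Sigma, n_g)$ we have by definition

\[
\begin{aligned}
E(L_1^h) &= \int_{A_g>0} |A|^{-Nh/2} \prod_{g=1}^{k} |A_g|^{N_g h/2} \prod_{g=1}^{k} W(A_g \mid \Sigma, n_g)\, dA_1 \cdots dA_k \\
&= \prod_{g=1}^{k} K(\Sigma, n_g) \int_{A_g>0} |A|^{-Nh/2} \prod_{g=1}^{k} |A_g|^{(N_g h + n_g - p - 1)/2} \exp\left(-\tfrac{1}{2}\operatorname{tr}\Sigma^{-1}A_g\right) dA_1 \cdots dA_k \\
&= \prod_{g=1}^{k} \frac{K(\Sigma, n_g)}{K(\Sigma, N_g h + n_g)} \int_{A_g>0} |A|^{-Nh/2} \prod_{g=1}^{k} W(A_g \mid \Sigma, N_g h + n_g)\, dA_1 \cdots dA_k
\end{aligned}
\tag{4.10}
\]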


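The integral on the right hand side of (4.10) is equal to $E(|A|^{-Nh/2})$, since $A = A_1 + A_2 + \dots + A_k$. Thus

\[
E(L_1^h) = \prod_{g=1}^{k} \frac{K(\Sigma, n_g)}{K(\Sigma, N_g h + n_g)} \cdot E\left(|A|^{-Nh/2}\right)
\tag{4.11}
\]

And so by theorem 2.3, substituting $-\frac{Nh}{2}$ for $h$ and recalling $A \sim W(A \mid \Sigma, Nh + N - k)$ we have

\[
E\left(|A|^{-Nh/2}\right) = \frac{K(\Sigma, Nh + N - k)}{K\left(\Sigma, Nh + N - k + 2(-\tfrac{Nh}{2})\right)} = \frac{K(\Sigma, Nh + N - k)}{K(\Sigma, N - k)}
\tag{4.12}
\]

Thus

\[
E(\lambda_1^h) = K_1 \left[ \prod_{g=1}^{k} \frac{K(\Sigma, n_g)}{K(\Sigma, N_g h + n_g)} \right] \frac{K(\Sigma, Nh + N - k)}{K(\Sigma, N - k)}
\tag{4.13}
\]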


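Using the definition of $\Gamma_p(\frac{n}{2})$ and $K(\Sigma, n)$ from equation (2.13) we have

\[
E(\lambda_1^h) = K_1 \prod_{i=1}^{p} \left\{ \prod_{g=1}^{k} \frac{\Gamma\left(\frac{n_g + N_g h + 1 - i}{2}\right)}{\Gamma\left(\frac{n_g + 1 - i}{2}\right)} \right\} \frac{\Gamma\left(\frac{N - k + 1 - i}{2}\right)}{\Gamma\left(\frac{Nh + N - k + 1 - i}{2}\right)}
\tag{4.14}
\]

where

\[
K_1 = \frac{N^{pNh/2}}{\prod_{g=1}^{k} N_g^{pN_g h/2}}.
\]

For the particular case of $p = 2$, $N_1 = N_2 = \dots = N_k = n$, $N = nk$ and $n_g = N_g - 1 = n - 1$, we have

\[
E(\lambda_1^h) = k^{knh}\, \frac{\Gamma\left\{\frac{k(n-1)}{2}\right\}\Gamma\left\{\frac{k(n-1) - 1}{2}\right\}}{\left[\Gamma\left(\frac{n-1}{2}\right)\Gamma\left(\frac{n-2}{2}\right)\right]^{k}} \cdot \frac{\left[\Gamma\left\{\frac{n(1+h)}{2} - \frac{1}{2}\right\}\Gamma\left\{\frac{n(1+h)}{2} - 1\right\}\right]^{k}}{\Gamma\left\{\frac{nk(1+h)}{2} - \frac{k}{2}\right\}\Gamma\left\{\frac{nk(1+h)}{2} - \frac{k+1}{2}\right\}}
\tag{4.15}
\]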


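The $h$th Moment of $\lambda_2$

The $h$th moment of $\lambda_2$ is obtained by the same method used to derive that of $\lambda_0$. Thus

\[
E(\lambda_2^h) = E(L_2^h) \quad \text{where} \quad L_2 = \frac{|A|^{N/2}}{|A+B|^{N/2}}
\tag{4.16}
\]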


Since $A = \sum_{g=1}^{k} A_g$ and each $A_g \sim W(A_g \mid \Sigma, n_g)$ we have by theorem 2.1 $A \sim W(A \mid \Sigma, \sum_{g=1}^{k} n_g)$. Now $\sum_{g=1}^{k} n_g = \sum_{g=1}^{k} (N_g - 1) = N - k$ so $A \sim W(A \mid \Sigma, N - k)$. From the discussion of the derivation of $\lambda_0$ we know that $B \sim W(B \mid \Sigma, k - 1)$ and $A$ and $B$ are independent. Since $L_2$ is a function of $A$ and $B$ the $h$th moment of $\lambda_2$ is by definition

\[
\begin{aligned}
E(L_2^h) &= \int_{A>0,\,B>0} |A|^{Nh/2}\, |A+B|^{-Nh/2}\, K(\Sigma, N-k)\, |A|^{(N - k - p - 1)/2} \exp\left(-\tfrac{1}{2}\operatorname{tr}\Sigma^{-1}A\right) f(B)\, dA\, dB \\
&= K(\Sigma, N-k) \int_{A>0,\,B>0} |A+B|^{-Nh/2}\, |A|^{(Nh + N - k - p - 1)/2} \exp\left(-\tfrac{1}{2}\operatorname{tr}\Sigma^{-1}A\right) f(B)\, dA\, dB \\
&= \frac{K(\Sigma, N-k)}{K(\Sigma, Nh + N - k)} \int_{A>0,\,B>0} |A+B|^{-Nh/2}\, W(A \mid \Sigma, Nh + N - k)\, f(B)\, dA\, dB
\end{aligned}
\tag{4.17}
\]


Recall that $(A+B) \sim W\{(A+B) \mid \Sigma, Nh + N - 1\}$. Therefore, we have

\[
E(L_2^h) = \frac{K(\Sigma, N-k)}{K(\Sigma, Nh + N - k)} \cdot E\left(|A+B|^{-Nh/2}\right)
\tag{4.18}
\]

Now, applying theorem 2.3, substituting $A + B$ for $A$ and $-\frac{Nh}{2}$ for $h$ we have

\[
E\left(|A+B|^{-Nh/2}\right) = \frac{K(\Sigma, Nh + N - 1)}{K\left(\Sigma, Nh + N - 1 + 2(-\tfrac{Nh}{2})\right)} = \frac{K(\Sigma, Nh + N - 1)}{K(\Sigma, N - 1)}
\tag{4.19}
\]

Combining (4.18) and (4.19) we have

\[
E(\lambda_2^h) = \frac{K(\Sigma, N-k)\, K(\Sigma, Nh + N - 1)}{K(\Sigma, Nh + N - k)\, K(\Sigma, N - 1)}
\tag{4.20}
\]


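Using the definition of $\Gamma_p(\frac{n}{2})$ and $K(\Sigma, n)$ from equation (2.13) we have

\[
E(\lambda_2^h) = \prod_{i=1}^{p} \frac{\Gamma\left(\frac{N - k + 1 - i}{2} + \frac{Nh}{2}\right)\Gamma\left(\frac{N - i}{2}\right)}{\Gamma\left(\frac{N - k + 1 - i}{2}\right)\Gamma\left(\frac{N - i}{2} + \frac{Nh}{2}\right)}
\tag{4.21}
\]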


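For the particular case of $p = 2$, $N_1 = N_2 = \dots = N_k = n$, $N = nk$, and $n_g = N_g - 1 = n - 1$, we have

\[
E(\lambda_2^h) = \frac{\Gamma\left(\frac{kn-1}{2}\right)\Gamma\left(\frac{kn-2}{2}\right)}{\Gamma\left\{\frac{k(n-1)}{2}\right\}\Gamma\left\{\frac{k(n-1)-1}{2}\right\}} \cdot \frac{\Gamma\left\{\frac{kn(1+h)}{2} - \frac{k}{2}\right\}\Gamma\left\{\frac{kn(1+h)}{2} - \frac{k+1}{2}\right\}}{\Gamma\left\{\frac{kn(1+h)}{2} - \frac{1}{2}\right\}\Gamma\left\{\frac{kn(1+h)}{2} - 1\right\}}
\tag{4.22}
\]

Now, applying Gauss's multiplication formula (4.7a) with $m = 2$ we have

\[
E(\lambda_2^h) = \frac{\Gamma(kn - 2)\,\Gamma\{kn(1+h) - k - 1\}}{\Gamma(kn - k - 1)\,\Gamma\{kn(1+h) - 2\}}
\tag{4.23}
\]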

In this chapter we obtained the $h$th moment of our test criteria $\lambda_i$, $i = 0, 1, 2$, making extensive use of the Wishart distribution and its properties. The moments were obtained for the general case of $p$ variables and sample sizes not necessarily equal. We then restricted the moments to $p = 2$ variables and equal sample sizes. Thus, we have the moments of our criteria from which we will obtain their sampling distributions in the next chapter.


V. Distribution of the Criteria


In the literature S. S. Wilks (Ref 17:60) and others have proven that the distribution of $-2 \log \lambda$, where $\lambda$ is the likelihood ratio criterion, approaches the Chi-Square distribution with $r$ degrees of freedom ($r$ is the number of linearly independent restrictions imposed by the null hypothesis) as the sample size $n$ approaches infinity. In this chapter we proceed to obtain the sampling distributions of $-2 \log \lambda_i$, $i = 0, 1, 2$, by inverting their characteristic functions.

Since (5.1) holds for any complex number $h$, we have, substituting $-2it$ for $h$ in equation (5.1),

\[
\phi_{\omega_0}(t) = K\, k^{kn(-2it)}\, \frac{[\Gamma\{n(1 - 2it) - 2\}]^{k}}{\Gamma\{nk(1 - 2it) - 2\}}
\tag{5.3}
\]

where

\[
K = \frac{\Gamma(nk - 2)}{[\Gamma(n - 2)]^{k}}
\]

and therefore

\[
\log \phi_{\omega_0}(t) = \log K - (2knit) \log k + k \log[\Gamma\{n(1 - 2it) - 2\}] - \log[\Gamma\{nk(1 - 2it) - 2\}]
\tag{5.4}
\]

The expansion of $\log \phi_{\omega_0}(t)$ will be based on the following expansion (Ref 1:204):

\[
\log \Gamma(x + h) = \tfrac{1}{2} \log(2\pi) + (x + h - \tfrac{1}{2}) \log x - x - \sum_{r=1}^{m} \frac{(-1)^{r} B_{r+1}(h)}{r(r+1)\, x^{r}} + R_{m+1}(x)
\tag{5.5}
\]

where $R_{m+1}(x)$ is the remainder such that $|R_{m+1}(x)| \le \theta / |x^{m+1}|$, $\theta$ a constant, and $B_r(h)$ is the Bernoulli polynomial of degree $r$ and order 1 defined by

\[
\frac{\tau e^{h\tau}}{e^{\tau} - 1} = \sum_{r=0}^{\infty} \frac{\tau^{r}}{r!} B_r(h).
\]

Extensive tables of Bernoulli polynomials are available in M. A. Fletcher et al. (Ref 4:62-117).
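The expansion (5.5) is easy to check numerically, which also guards against sign slips in the Bernoulli terms. The sketch below is illustrative only: the function names are hypothetical, the Bernoulli polynomials are built exactly from Bernoulli numbers (convention $B_1 = -\tfrac{1}{2}$), and the remainder term is simply dropped.

```python
from fractions import Fraction
from math import comb, lgamma, log, pi

def bernoulli_numbers(n_max):
    """Bernoulli numbers B_0..B_n_max with B_1 = -1/2, as exact fractions."""
    B = [Fraction(1)]
    for m in range(1, n_max + 1):
        s = sum(comb(m + 1, j) * B[j] for j in range(m))
        B.append(-s / (m + 1))
    return B

def bernoulli_poly(r, h):
    """B_r(h) = sum_j C(r, j) B_j h^(r - j), evaluated exactly."""
    B = bernoulli_numbers(r)
    h = Fraction(h)
    return sum(comb(r, j) * B[j] * h ** (r - j) for j in range(r + 1))

def log_gamma_expansion(x, h, m):
    """Right-hand side of (5.5) with the remainder R_{m+1}(x) omitted."""
    total = 0.5 * log(2 * pi) + (x + h - 0.5) * log(x) - x
    for r in range(1, m + 1):
        b = float(bernoulli_poly(r + 1, h))
        total -= (-1) ** r * b / (r * (r + 1) * x ** r)
    return total

if __name__ == "__main__":
    x, h = 50.0, -2          # h = -2 is the shift used for lambda_0
    for m in range(0, 5):
        print(m, log_gamma_expansion(x, h, m), lgamma(x + h))
```

With $h = -2$ and $x = 50$ a handful of terms already agrees with `math.lgamma` to several decimal places; the agreement degrades for small $x$, as expected of an asymptotic series.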


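Applying the expansion in (5.5) to the gamma functions in (5.4) we have

\[
\begin{aligned}
\log \phi_{\omega_0}(t) = {} & \log K - 2knit \log k \\
& + k \Big\{ \tfrac{1}{2}\log(2\pi) + [n(1 - 2it) - 2 - \tfrac{1}{2}] \log[n(1 - 2it)] - n(1 - 2it) - \sum_{r=1}^{m} \frac{(-1)^{r} B_{r+1}(-2)}{r(r+1)[n(1 - 2it)]^{r}} + R_{m+1} \Big\} \\
& - \Big\{ \tfrac{1}{2}\log(2\pi) + [nk(1 - 2it) - 2 - \tfrac{1}{2}] \log[nk(1 - 2it)] - nk(1 - 2it) - \sum_{r=1}^{m} \frac{(-1)^{r} B_{r+1}(-2)}{r(r+1)[nk(1 - 2it)]^{r}} + R'_{m+1} \Big\}
\end{aligned}
\tag{5.6}
\]

Let $T = n(1 - 2it)$ so that $kT = kn(1 - 2it) = kn - 2knit$. Substituting these results in equation (5.6), the coefficient of $\log k$ is $-2knit - (kT - \tfrac{5}{2}) = -(kn - \tfrac{5}{2})$ and the coefficient of $\log T$ is $k(T - \tfrac{5}{2}) - (kT - \tfrac{5}{2}) = -\tfrac{5}{2}(k - 1)$, so that after some algebra

\[
\log \phi_{\omega_0}(t) = \log\left[ K (2\pi)^{\frac{k-1}{2}} k^{-(kn - 5/2)} T^{-v} \right] + \sum_{r=1}^{m} \frac{(-1)^{r} B_{r+1}(-2)}{r(r+1)(kT)^{r}} - k \sum_{r=1}^{m} \frac{(-1)^{r} B_{r+1}(-2)}{r(r+1)\, T^{r}} + R'_{m+1}
\tag{5.7}
\]

where $v = \tfrac{5}{2}(k - 1)$.

Equation (5.7) can be rewritten as

\[
\log \phi_{\omega_0}(t) = \log\left\{ K (2\pi)^{\frac{k-1}{2}} k^{-(kn - 5/2)} T^{-v} \exp\left[ \sum_{r=1}^{m} \frac{A_r}{T^{r}} + R'_{m+1}(x) \right] \right\}
\tag{5.8}
\]

where

\[
A_r = \frac{(-1)^{r}}{r(r+1)} \left[ \frac{B_{r+1}(-2)}{k^{r}} - k\, B_{r+1}(-2) \right]
\tag{5.9}
\]

Thus, from equation (5.8) we have

\[
\phi_{\omega_0}(t) = K (2\pi)^{\frac{k-1}{2}} k^{-(kn - 5/2)} T^{-v} \left[ \sum_{i=0}^{\infty} \frac{Q_i}{T^{i}} + R_{m+1}(x) \right]
\tag{5.10}
\]

The coefficients $Q_i$ are recursively computed using the following relation between $Q_i$ and $A_r$:

\[
Q_i = \frac{1}{i} \sum_{j=1}^{i} j A_j Q_{i-j}, \qquad Q_0 = 1
\tag{5.11}
\]

(Ref 6: Chapters 4, 5)

Recalling that $T = n(1 - 2it)$, we have from (5.10) the characteristic function of $\omega_0$ as

\[
\phi_{\omega_0}(t) = K (2\pi)^{\frac{k-1}{2}} k^{-(kn - 5/2)} \sum_{r=0}^{m} Q_r\, [n(1 - 2it)]^{-\frac{2(v+r)}{2}} + R_{m+1}(x)
\tag{5.12}
\]

Here $(1 - 2it)^{-\frac{2(v+r)}{2}}$ is the characteristic function of a Chi-Square variable with $2(v + r)$ degrees of freedom. We have, on inverting the characteristic function of $\omega_0$ in (5.12), the p.d.f. of $\omega_0$ as

\[
f(\omega_0) = K (2\pi)^{\frac{k-1}{2}} k^{-(kn - 5/2)} \sum_{r=0}^{m} Q_r\, n^{-(v+r)}\, \chi^2_{2(v+r)}(\omega_0) + R_{m+1}(x)
\tag{5.13}
\]

where $\chi^2_{f}(\cdot)$ denotes the Chi-Square density with $f$ degrees of freedom, $Q_0 = 1$,

\[
K = \frac{\Gamma(nk - 2)}{[\Gamma(n - 2)]^{k}} \quad \text{and} \quad v = \tfrac{5}{2}(k - 1).
\]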


Thus we have the following theorem:

Theorem 5.1: Under the null hypothesis in (3.1i), the distribution of $\omega_0 = -2 \log \lambda_0$ can be represented as the following linear combination of Chi-Square distributions:

\[
P(\omega_0 \ge x) = K (2\pi)^{\frac{k-1}{2}} k^{-(kn - 5/2)} \sum_{r=0}^{\infty} n^{-(v+r)}\, Q_r\, P(\chi^2_{2(v+r)} \ge x)
\tag{5.14}
\]

where $v = \tfrac{5}{2}(k - 1)$ and $K = \dfrac{\Gamma(nk - 2)}{[\Gamma(n - 2)]^{k}}$.


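The Distribution of $\lambda_1$

The $h$th moment of $\lambda_1$ is given by equation (4.15) as

\[
E(\lambda_1^h) = K_1\, k^{knh}\, \frac{\left[\Gamma\left\{\frac{n}{2}(1+h) - \frac{1}{2}\right\}\right]^{k} \left[\Gamma\left\{\frac{n}{2}(1+h) - 1\right\}\right]^{k}}{\Gamma\left\{\frac{nk}{2}(1+h) - \frac{k}{2}\right\}\Gamma\left\{\frac{nk}{2}(1+h) - \frac{1}{2} - \frac{k}{2}\right\}}
\tag{5.15}
\]

where

\[
K_1 = \frac{\Gamma\left\{\frac{k(n-1)}{2}\right\}\Gamma\left\{\frac{k(n-1) - 1}{2}\right\}}{\left[\Gamma\left(\frac{n-1}{2}\right)\Gamma\left(\frac{n-2}{2}\right)\right]^{k}}
\]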


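Let $\omega_1 = -2 \log \lambda_1$ and $\phi_{\omega_1}(t)$ be the characteristic function of $\omega_1$. Then

\[
\phi_{\omega_1}(t) = E(\lambda_1^{-2it}) = K_1\, k^{kn(-2it)}\, \frac{\left[\Gamma\left\{\frac{n}{2}(1 - 2it) - \frac{1}{2}\right\}\right]^{k} \left[\Gamma\left\{\frac{n}{2}(1 - 2it) - 1\right\}\right]^{k}}{\Gamma\left\{\frac{nk}{2}(1 - 2it) - \frac{k}{2}\right\}\Gamma\left\{\frac{nk}{2}(1 - 2it) - \frac{1}{2} - \frac{k}{2}\right\}}
\tag{5.16}
\]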


Taking $\log \phi_{\omega_1}(t)$, applying the gamma expansion formula (5.5) and letting $T = \frac{n}{2}(1 - 2it)$ we have

\[
\log \phi_{\omega_1}(t) = \log K_1 - 2itkn \log k + (a) + (b) + (c) + (d)
\tag{5.17}
\]

where

\[
\begin{aligned}
(a) &= \tfrac{k}{2} \log(2\pi) + k[(T - 1) \log T - T] - k \sum_{r=1}^{m} \frac{(-1)^{r} B_{r+1}(-\tfrac{1}{2})}{r(r+1)\, T^{r}} + k R^{a}_{m+1}(T) \\
(b) &= \tfrac{k}{2} \log(2\pi) + k[(T - \tfrac{3}{2}) \log T - T] - k \sum_{r=1}^{m} \frac{(-1)^{r} B_{r+1}(-1)}{r(r+1)\, T^{r}} + k R^{b}_{m+1}(T) \\
(c) &= -\tfrac{1}{2} \log(2\pi) - [(kT - \tfrac{k}{2} - \tfrac{1}{2}) \log kT - kT] + \sum_{r=1}^{m} \frac{(-1)^{r} B_{r+1}(-\tfrac{k}{2})}{r(r+1)(kT)^{r}} + R^{c}_{m+1}(kT) \\
(d) &= -\tfrac{1}{2} \log(2\pi) - [(kT - \tfrac{k}{2} - 1) \log kT - kT] + \sum_{r=1}^{m} \frac{(-1)^{r} B_{r+1}(-\tfrac{k+1}{2})}{r(r+1)(kT)^{r}} + R^{d}_{m+1}(kT)
\end{aligned}
\]

After some algebra, equation (5.17) reduces to

\[
\begin{aligned}
\log \phi_{\omega_1}(t) &= \log K_1 + (k - 1) \log(2\pi) + [k(1 - n) + \tfrac{3}{2}] \log k - \tfrac{3}{2}(k - 1) \log T + \sum_{r=1}^{m} \frac{A_r}{T^{r}} + R'_{m+1} \\
&= \log\left\{ K_1 (2\pi)^{k-1} k^{[k(1-n) + 3/2]} T^{-v} \exp\left[ \sum_{r=1}^{m} \frac{A_r}{T^{r}} + R'_{m+1} \right] \right\}
\end{aligned}
\tag{5.18}
\]

where

\[
A_r = \frac{(-1)^{r}}{r(r+1)} \left[ \frac{B_{r+1}(-\tfrac{k}{2}) + B_{r+1}(-\tfrac{k+1}{2})}{k^{r}} - k\, B_{r+1}(-1) - k\, B_{r+1}(-\tfrac{1}{2}) \right]
\quad \text{and} \quad v = \frac{3(k - 1)}{2}.
\]

Thus we have from (5.18)

\[
\phi_{\omega_1}(t) = K_1 (2\pi)^{k-1} k^{[k(1-n) + 3/2]} T^{-v} \left[ \sum_{i=0}^{\infty} \frac{Q_i}{T^{i}} + R_{m+1} \right]
\tag{5.19}
\]

where the coefficients $Q_i$ can be recursively computed as in (5.11).

Recalling that $T = \frac{n}{2}(1 - 2it)$, we have from equation (5.19) the characteristic function of $\omega_1$ as

\[
\phi_{\omega_1}(t) = K_1 (2\pi)^{k-1} k^{(k - kn + 3/2)} \sum_{r=0}^{m} Q_r \left[ \tfrac{n}{2}(1 - 2it) \right]^{-\frac{2(v+r)}{2}} + R_{m+1}
\tag{5.20}
\]

Since $(1 - 2it)^{-\frac{2(v+r)}{2}}$ is the characteristic function of a Chi-Square density with $2(v + r)$ degrees of freedom, we have on inverting the characteristic function of $\omega_1$ in (5.20) the p.d.f. of $\omega_1$ as

\[
f(\omega_1) = K_1 (2\pi)^{k-1} k^{(k - kn + 3/2)} \sum_{r=0}^{m} Q_r \left( \frac{2}{n} \right)^{v+r} \chi^2_{2(v+r)}(\omega_1) + R_{m+1}
\]


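and thus we have the following theorem:

Theorem 5.2: Under the null hypothesis in (3.1ii) the distribution of $\omega_1 = -2 \log \lambda_1$ can be represented as the following linear combination of Chi-Square distributions:

\[
P(\omega_1 \ge x) = K_1 (2\pi)^{k-1} k^{(k - kn + 3/2)} \sum_{r=0}^{\infty} \left( \frac{2}{n} \right)^{v+r} Q_r\, P(\chi^2_{2(v+r)} \ge x)
\tag{5.21}
\]

where $v = \tfrac{3}{2}(k - 1)$ and $K_1 = \dfrac{\Gamma\left\{\frac{k(n-1)}{2}\right\}\Gamma\left\{\frac{k(n-1) - 1}{2}\right\}}{\left[\Gamma\left(\frac{n-1}{2}\right)\Gamma\left(\frac{n-2}{2}\right)\right]^{k}}$.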

The Distribution of $\lambda_2$

The $h$th moment of $\lambda_2$ from equation (4.23) is

\[
E(\lambda_2^h) = \frac{\Gamma(nk - 2)\,\Gamma\{nk(1 + h) - k - 1\}}{\Gamma(nk - k - 1)\,\Gamma\{nk(1 + h) - 2\}}
\tag{5.22}
\]

Let $L_2 = \lambda_2^{1/N}$ where $N = nk$. We then have from (5.22)

\[
E(L_2^h) = \frac{\Gamma(nk - 2)\,\Gamma\{nk(1 + \frac{h}{nk}) - k - 1\}}{\Gamma(nk - k - 1)\,\Gamma\{nk(1 + \frac{h}{nk}) - 2\}} = \frac{\Gamma(nk - 2)\,\Gamma\{k(n - 1) - 1 + h\}}{\Gamma(nk - k - 1)\,\Gamma(nk - 2 + h)}
\tag{5.23}
\]

Note that the $h$th moment of a beta distribution is

\[
\frac{\Gamma\{\tfrac{1}{2}(a + b)\}\,\Gamma\{\tfrac{1}{2}a + h\}}{\Gamma\{\tfrac{1}{2}(a + b) + h\}\,\Gamma(\tfrac{1}{2}a)}
\]

with parameters $a/2$ and $b/2$ (Ref 1:194).

Thus from (5.23) we see that $L_2$ has a beta p.d.f. with parameters $N - k - 1$ and $k - 1$. So we have the following theorem:

Theorem 5.3: Under the null hypothesis in (3.1iii) the distribution of $L_2$ is given by

\[
P(L_2 \le x) = I_x(N - k - 1,\, k - 1)
\tag{5.24}
\]

where $I_x(\cdot,\cdot)$ is the incomplete beta function.
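Because both beta parameters in Theorem 5.3 are integers, the incomplete beta function reduces to a finite binomial sum, $I_x(a, b) = \sum_{j=a}^{a+b-1} \binom{a+b-1}{j} x^{j} (1-x)^{a+b-1-j}$, and lower percentage points of $L_2$ can be found by bisection. The sketch below is an illustration under that identity, with hypothetical function names; it is not the program used to produce the appendix tables. For $n = 3$, $k = 3$, $\alpha = .05$ it returns a value close to the Table III entry .4182 quoted in Chapter VII.

```python
from math import comb, lgamma

def log_moment_L2(h, n, k):
    """log E(L_2^h) from (5.23), with N = nk."""
    N = n * k
    return (lgamma(N - 2) + lgamma(N - k - 1 + h)
            - lgamma(N - k - 1) - lgamma(N - 2 + h))

def log_beta_moment(h, a, b):
    """log of the h-th moment of a Beta(a, b) variable."""
    return lgamma(a + b) + lgamma(a + h) - lgamma(a) - lgamma(a + b + h)

def beta_cdf_int(x, a, b):
    """I_x(a, b) for positive integers a, b via the binomial-sum identity."""
    m = a + b - 1
    return sum(comb(m, j) * x ** j * (1 - x) ** (m - j) for j in range(a, m + 1))

def lower_point_L2(alpha, n, k):
    """Solve I_x(N - k - 1, k - 1) = alpha for x by bisection on (0, 1)."""
    a, b = n * k - k - 1, k - 1
    lo, hi = 0.0, 1.0
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if beta_cdf_int(mid, a, b) < alpha:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

if __name__ == "__main__":
    print(lower_point_L2(0.05, 3, 3))
```

The moment functions are included only to make the identification in (5.23) explicit: with $a = N - k - 1$ and $b = k - 1$ the two log-moments coincide term by term.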

Numerical Computations

The c.d.f.'s of $\omega_i = -2 \log \lambda_i$ ($i = 0, 1$) given in equations (5.14) and (5.21) are used to compute percentage points of $\omega_i$ at the levels of significance $\alpha = .01$ and $\alpha = .05$ with sample size $n$ from 3 to 100 and $k = 2(1)6$.

These are presented in Tables IV and V in the appendix. Tables I, II and III in the appendix give the percentage points of $L_0$, $L_1$, and $L_2$ respectively. The following considerations are taken in checking the accuracy of the computations of the percentage points:

1. The total probability given by the series in (5.14) and (5.21), that is, their value at $x = 0$, rapidly approaches one. Table 5-1, for $k = 6$ and Theorem 5.2, gives the typical behavior of the series as the number of terms increases. To achieve accuracy to five significant figures in all cases considered required fifteen terms; and

2. The exact values are in close agreement with the approximate values obtained using the asymptotic expansion in Chapter VI, even for comparatively small values of $n$.


TABLE 5-1

EVALUATION OF c.d.f. OF THEOREM 5.2 FOR m TERMS

     m      n = 10       n = 20
     1     .2160599     .4839011
     2     .5146427     .8182633
     3     .7491630     .9495748
     4     .8868061     .9881090
     5     .9539563     .9975087
     6     .9827359     .9995229
     7     .9939364     .9999149
     8     .9979812     .9999857
     9     .9993571     .9999977
    10     .9998028     .9999996
    11     .9999414     .9999999
    12     .9999831    1.0000000
    13     .9999952    1.0000000
    14     .9999987    1.0000000
    15     .9999996    1.0000000
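The accuracy check behind Table 5-1 can be sketched directly from Theorem 5.2: at $x = 0$ every Chi-Square tail probability equals one, so the partial sums of the weights $K_1 (2\pi)^{k-1} k^{(k-kn+3/2)} (2/n)^{v+r} Q_r$ should climb toward one. The code below is a minimal illustration with hypothetical function names; it assumes the coefficients $A_r$ take the form implied by expanding the four gamma functions of (5.16) with (5.5), and uses the recursion (5.11) for $Q_r$. Under those assumptions the first partial sums for $k = 6$, $n = 10$ come out near the tabulated .2160599, .5146427 and .7491630.

```python
from fractions import Fraction
from math import comb, exp, lgamma, log, pi

def bernoulli_poly(r, h):
    """Bernoulli polynomial B_r(h), exact, using B_1 = -1/2."""
    B = [Fraction(1)]
    for m in range(1, r + 1):
        B.append(-sum(comb(m + 1, j) * B[j] for j in range(m)) / (m + 1))
    h = Fraction(h)
    return sum(comb(r, j) * B[j] * h ** (r - j) for j in range(r + 1))

def a_coeffs(k, m):
    """A_1..A_m for lambda_1 (index 0 is a placeholder)."""
    A = [0.0]
    for r in range(1, m + 1):
        inner = ((bernoulli_poly(r + 1, Fraction(-k, 2))
                  + bernoulli_poly(r + 1, Fraction(-(k + 1), 2))) / Fraction(k) ** r
                 - k * bernoulli_poly(r + 1, -1)
                 - k * bernoulli_poly(r + 1, Fraction(-1, 2)))
        A.append(float(Fraction((-1) ** r, r * (r + 1)) * inner))
    return A

def q_coeffs(A, m):
    """Q_0..Q_m from the recursion i*Q_i = sum_{j=1..i} j*A_j*Q_{i-j}."""
    Q = [1.0]
    for i in range(1, m + 1):
        Q.append(sum(j * A[j] * Q[i - j] for j in range(1, i + 1)) / i)
    return Q

def partial_masses(n, k, terms):
    """Cumulative sums of the Theorem 5.2 weights for r = 0..terms-1."""
    v = 1.5 * (k - 1)
    log_c = (lgamma(k * (n - 1) / 2) + lgamma((k * (n - 1) - 1) / 2)
             - k * (lgamma((n - 1) / 2) + lgamma((n - 2) / 2))
             + (k - 1) * log(2 * pi) + (k - k * n + 1.5) * log(k))
    Q = q_coeffs(a_coeffs(k, terms), terms)
    sums, running = [], 0.0
    for r in range(terms):
        running += exp(log_c - (v + r) * log(n / 2)) * Q[r]
        sums.append(running)
    return sums

if __name__ == "__main__":
    for m, s in enumerate(partial_masses(10, 6, 15), start=1):
        print(m, round(s, 7))
```

The Bernoulli polynomials are kept as exact fractions until the last step so that cancellation between the $k^{-r}$ and $k$ terms does not cost precision; only the recursion and the final weights use floating point.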

Summary

In this chapter we obtained the sampling distributions of our test criteria $\lambda_0$, $\lambda_1$ and $\lambda_2$. Note that the distributions of both $\lambda_0$ and $\lambda_1$ are linear combinations of Chi-Square distributions. The distribution of $L_2 = \lambda_2^{1/N}$ resulted in a beta distribution with parameters $k(n - 1) - 1$ and $k - 1$. From the distributions, percentage points for different sample sizes and $k$ populations can be computed.

Providing tables for numerous populations and various sample sizes would be inconvenient and time-consuming; therefore, a good approximation for moderate sample sizes would be extremely beneficial. It is this topic that we treat in the next chapter.


VI. Asymptotic Approximation to the Distribution


Although the tables of percentage points fill a gap and meet some of the needs of statisticians, approximations to the distributions of the various criteria are the only practical means for computing the observed significance probabilities in the analysis.

In this chapter we develop an asymptotic expansion of the distributions of $\lambda_0$ and $\lambda_1$ with the second term of the order $m^{-2}$ so that the first term alone should provide a powerful approximation to the percentage points of $\lambda_0$ and $\lambda_1$ even for comparatively small values of $n$.

Asymptotic Expansion of the Distribution of $\lambda_0$

Let $M_0 = -2\rho \log \lambda_0$ where $\rho$ is an arbitrary constant to be chosen later. From equation (5.1), substituting $-2\rho it$ for $h$, the characteristic function of $M_0$ is given by

\[
\phi_{M_0}(t) = E(\lambda_0^{-2\rho it}) = K_0\, C(t)
\tag{6.1}
\]

where

\[
K_0 = \frac{\Gamma(nk - 2)}{[\Gamma(n - 2)]^{k}}
\tag{6.1a}
\]

and

\[
C(t) = \frac{k^{-2\rho itkn}\, [\Gamma\{\rho n(1 - 2it) + (n - \rho n - 2)\}]^{k}}{\Gamma\{\rho nk(1 - 2it) + (nk - \rho nk - 2)\}}
\tag{6.1b}
\]


Let $T = m(1 - 2it)$ where $m = \rho n$. Then we have

\[
\log C(t) = (Tk - \rho nk) \log k + k \log \Gamma\{T + (n - \rho n - 2)\} - \log \Gamma\{kT + (nk - \rho nk - 2)\}
\tag{6.2}
\]

Using the asymptotic expansion formula (5.5) in (6.2), we have the asymptotic expansion of $C(t)$.

\[
\log C(t) = (-nk + \tfrac{5}{2}) \log k + \frac{(k - 1)}{2} \log(2\pi) - v \log T + \sum_{r=1}^{u} \frac{A_r}{T^{r}} + R'_{u+1}
\tag{6.3}
\]

where $v = \tfrac{5}{2}(k - 1)$ and

\[
A_r = \frac{(-1)^{r}}{r(r+1)\, k^{r}} \left[ B_{r+1}(nk - \rho nk - 2) - k^{r+1} B_{r+1}(n - \rho n - 2) \right]
\tag{6.3a}
\]

Thus the asymptotic expansion of $C(t)$ up to the order $m^{-2}$ is given by

\[
C(t) = (2\pi)^{\frac{k-1}{2}}\, k^{-(kn - \frac{5}{2})}\, T^{-v} \left[ 1 + \frac{Q_1}{T} + \frac{Q_2}{T^{2}} + O(m^{-3}) \right]
\tag{6.4}
\]

The characteristic function of $M_0$ is given by

\[
\begin{aligned}
\phi_{M_0}(t) &= (1 - 2it)^{-v} \left[ 1 + \frac{Q_2}{m^{2}(1 - 2it)^{2}} + O(m^{-3}) \right] \left[ 1 + \frac{Q'_2}{m^{2}} + O(m^{-3}) \right] \\
&= (1 - 2it)^{-v} + \frac{Q_2}{m^{2}} \left[ (1 - 2it)^{-v-2} - (1 - 2it)^{-v} \right] + O(m^{-3})
\end{aligned}
\tag{6.9}
\]

Therefore, inverting the characteristic function (6.9), we have the following theorem:

Theorem 6.1: Under the hypothesis (3.1i) the asymptotic expansion of the distribution of $M_0 = -2\rho \log \lambda_0$ up to the order $m^{-2}$ is given by

\[
P(M_0 \ge x) = P(\chi^2_{2v} \ge x) + \frac{Q_2}{m^{2}} \left[ P(\chi^2_{2(v+2)} \ge x) - P(\chi^2_{2v} \ge x) \right] + O(m^{-3})
\tag{6.10}
\]

where $v = \tfrac{5}{2}(k - 1)$, $m = \rho n$ and $\rho$ is as in (6.8).

Remark: Since the second term in (6.10) is of the order $m^{-2}$, the first term alone provides a powerful approximation to the percentage points of $\lambda_0$ as seen from Table 6-1 and Table 6-2.


TABLE 6-1

COMPARISON OF APPROXIMATION AND EXACT DISTRIBUTION

α = .01

   n\k             2        3        4        5        6
     3   exact   .04028*  .06222   .08586   .10960   .13324
         approx  .03763   .05775   .07256   .08432   .09378
     4   exact   .16708   .18264   .20110   .21897   .23597
         approx  .17302   .19377   .21094   .22498   .23630
     5   exact   .29807   .30775   .32173   .33460   .34631
         approx  .30197   .31600   .33081   .34372   .35433
    10   exact   .62902   .62832   .63505   .64194   .64805
         approx  .62951   .62962   .63617   .64325   .64948
    15   exact   .75051   .74821   .75236   .75696   .76115
         approx  .75063   .74868   .75257   .75726   .76151
    25   exact   .84963   .84730   .84956   .85229   .85483
         approx  .84965   .84746   .84952   .85229   .85486
    50   exact   .92466   .92311   .92414   .92549   .92676
         approx  .92465   .92317   .92409   .92545   .92674
    75   exact   .94975   .94863   .94929   .95018   .95103
         approx  .94974   .94867   .94925   .95015   .95101
   100   exact   .96230   .96143   .96192   .96259   .96322
         approx  .96230   .96146   .96189   .96256   .96321

*The upper number is the exact value $L_0 = \lambda_0^{1/kn}$. The lower number is the approximation value.


TABLE 6-2

COMPARISON OF APPROXIMATION AND EXACT DISTRIBUTION

α = .05

   n\k             2        3        4        5        6
     3   exact   .08728*  .10343   .12627   .15082   .17639
         approx  .09006   .10526   .11732   .12658   .13395
     4   exact   .26961   .26141   .26769   .27759   .28911
         approx  .27594   .27374   .28050   .28746   .29365
     5   exact   .41201   .39565   .39639   .40047   .40560
         approx  .41527   .40273   .40511   .40963   .41425
    10   exact   .71171   .69325   .69007   .69044   .69185
         approx  .71201   .69402   .69111   .69161   .69310
    15   exact   .81011   .79553   .79251   .79234   .79306
         approx  .81016   .79572   .79279   .79265   .79340
    25   exact   .88730   .87748   .87523   .87491   .87523
         approx  .88729   .87751   .87528   .87496   .87529
    50   exact   .94414   .93884   .93755   .93730   .93742
         approx  .94413   .93884   .93755   .93730   .93742
    75   exact   .96287   .95926   .95836   .95817   .95824
         approx  .96286   .95925   .95835   .95817   .95823
   100   exact   .97220   .96945   .96876   .96862   .96867
         approx  .97219   .96945   .96876   .96861   .96866

*The upper number is the exact value $L_0 = \lambda_0^{1/kn}$. The lower number is the approximation value.


Asymptotic Expansion of the Distribution of $\lambda_1$
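Let $M_1 = -2q \log \lambda_1$, where $q$ is an arbitrary constant to be chosen later. From equation (5.15), substituting $-2qit$ for $h$, the characteristic function of $M_1$ is given by

\[
\phi_{M_1}(t) = K_1\, C_1(t)
\tag{6.11}
\]

where

\[
K_1 = \frac{\Gamma\left\{\frac{k(n-1)}{2}\right\}\Gamma\left\{\frac{k(n-1) - 1}{2}\right\}}{\left[\Gamma\left(\frac{n-1}{2}\right)\Gamma\left(\frac{n-2}{2}\right)\right]^{k}}
\tag{6.11a}
\]

and

\[
C_1(t) = k^{-2qitkn}\, \frac{\left[\Gamma\left\{\frac{n}{2}(1 - 2qit) - \frac{1}{2}\right\}\right]^{k} \left[\Gamma\left\{\frac{n}{2}(1 - 2qit) - 1\right\}\right]^{k}}{\Gamma\left\{\frac{nk}{2}(1 - 2qit) - \frac{k}{2}\right\}\Gamma\left\{\frac{nk}{2}(1 - 2qit) - \frac{1}{2} - \frac{k}{2}\right\}}
\tag{6.11b}
\]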


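Let $T = m(1 - 2it)$ with $m = \frac{qn}{2}$. Applying the asymptotic expansion formula (5.5) to $\log C_1(t)$ in (6.11b) we have after some algebra

\[
\log C_1(t) = (k - 1) \log(2\pi) + (-kn + k + \tfrac{3}{2}) \log k - v \log T + \sum_{r=1}^{u} \frac{A_r}{T^{r}} + R_{u+1}
\tag{6.12}
\]

where $v = \tfrac{3}{2}(k - 1)$ and

\[
A_r = \frac{(-1)^{r}}{r(r+1)\, k^{r}} \left[ B_{r+1}\left(\frac{kn - k - qkn}{2}\right) + B_{r+1}\left(\frac{kn - k - 1 - qkn}{2}\right) - k^{r+1} \left\{ B_{r+1}\left(\frac{n - qn - 1}{2}\right) + B_{r+1}\left(\frac{n - qn - 2}{2}\right) \right\} \right]
\tag{6.12a}
\]

Thus the asymptotic expansion of $C_1(t)$ up to the order $m^{-2}$ is given by

\[
C_1(t) = (2\pi)^{(k-1)}\, k^{(-kn + k + \frac{3}{2})}\, T^{-v} \left[ 1 + \frac{Q_1}{T} + \frac{Q_2}{T^{2}} + O(m^{-3}) \right]
\tag{6.13}
\]

where $Q_1 = A_1$ and $Q_2 = A_2 + \dfrac{A_1^{2}}{2}$.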


Now from equation (6.11a), where $m = \frac{qn}{2}$, we have

\[
K_1 = \frac{\Gamma\left(km + \frac{kn - k - qkn}{2}\right)\Gamma\left(km + \frac{kn - k - 1 - qkn}{2}\right)}{\left[\Gamma\left(m + \frac{n - qn - 1}{2}\right)\Gamma\left(m + \frac{n - qn - 2}{2}\right)\right]^{k}}
\tag{6.14}
\]

Again applying the asymptotic expansion formula (5.5) to $\log K_1$ in (6.14) we have

\[
\log K_1 = (1 - k) \log(2\pi) + v \log m + (kn - k - \tfrac{3}{2}) \log k + \sum_{r=1}^{u} \frac{A'_r}{m^{r}} + R''_{u+1}
\tag{6.15}
\]

where $A'_r = -A_r$.

And so the asymptotic expansion of $K_1$ up to the order $m^{-2}$ is given by

\[
K_1 = (2\pi)^{-(k-1)}\, m^{v}\, k^{(kn - k - \frac{3}{2})} \left[ 1 + \frac{Q'_1}{m} + \frac{Q'_2}{m^{2}} + O(m^{-3}) \right]
\tag{6.16}
\]

where $Q'_1 = -A_1$ and $Q'_2 = -A_2 + \dfrac{(-A_1)^{2}}{2}$.

Now choose $q$ such that $A_1 = 0$. Then

\[
q = 1 - \frac{31k + 13}{18nk}
\tag{6.17}
\]


From (6.11), (6.13) and (6.16) the characteristic function of $M_1$ is given by

\[
\begin{aligned}
\phi_{M_1}(t) &= (1 - 2it)^{-v} \left[ 1 + \frac{Q_2}{m^{2}(1 - 2it)^{2}} + O(m^{-3}) \right] \left[ 1 + \frac{Q'_2}{m^{2}} + O(m^{-3}) \right] \\
&= (1 - 2it)^{-v} + \frac{Q_2}{m^{2}} \left[ (1 - 2it)^{-v-2} - (1 - 2it)^{-v} \right] + O(m^{-3})
\end{aligned}
\tag{6.18}
\]

Therefore, inverting the characteristic function $\phi_{M_1}(t)$ in (6.18) we have the following theorem:

Theorem 6.2: Under the hypothesis (3.1ii) the asymptotic expansion of the distribution of $M_1 = -2q \log \lambda_1$ up to order $m^{-2}$ is given by

\[
P(M_1 \ge x) = P(\chi^2_{2v} \ge x) + \frac{Q_2}{m^{2}} \left[ P(\chi^2_{2(v+2)} \ge x) - P(\chi^2_{2v} \ge x) \right] + O(m^{-3})
\tag{6.19}
\]

Remark: The second term in (6.19) is of the order of $m^{-2}$ and so the first term alone should provide a good approximation to the percentage points of $\lambda_1$, even for relatively small sample sizes as shown in Tables 6-3 and 6-4.


TABLE 6-3

COMPARISON OF APPROXIMATION AND EXACT DISTRIBUTION

α = .01

   n\k             2        3        4        5        6
     3   exact   .06038*  .09887   .12372   .15609   .18748
         approx  .04557   .06709   .08424   .09869   .11041
     4   exact   .22992   .25183   .27559   .29787   .31870
         approx  .22829   .25274   .27509   .29221   .30661
     5   exact   .37893   .39586   .41467   .43083   .44484
         approx  .37882   .39752   .41731   .43273   .44585
    10   exact   .69895   .70546   .71541   .72409   .73131
         approx  .69934   .70567   .71585   .72419   .73150
    15   exact   .80287   .80655   .81311   .81894   .82382
         approx  .80317   .80662   .81329   .81886   .82379
    25   exact   .88359   .88547   .88934   .89284   .89578
         approx  .88378   .88549   .88942   .89274   .89571
    50   exact   .94253   .94334   .94525   .94699   .94846
         approx  .94263   .94334   .94528   .94693   .94847
    75   exact   .96185   .96236   .96363   .96479   .96577
         approx  .96192   .96237   .96365   .96475   .96574
   100   exact   .97145   .97182   .97277   .97364   .97437
         approx  .97150   .97183   .97278   .97361   .97435

*The upper number is the exact value $L_1 = \lambda_1^{1/kn}$. The lower number is the approximation value.


TABLE 6-4

COMPARISON OF APPROXIMATION AND EXACT DISTRIBUTION

α = .05

   n\k             2        3        4        5        6
     3   exact   .13239*  .15090   .17995   .21108   .24261
         approx  .11862   .13208   .14541   .15637   .16530
     4   exact   .36242   .35480   .36258   .37372   .38636
         approx  .36075   .35680   .36466   .37315   .38073
     5   exact   .51231   .49968   .50294   .50869   .51484
         approx  .51171   .50095   .50506   .51111   .51695
    10   exact   .78134   .77006   .76992   .77192   .77434
         approx  .78126   .77012   .77006   .77216   .77462
    15   exact   .85964   .85128   .85083   .85199   .85350
         approx  .85960   .85127   .85084   .85204   .85356
    25   exact   .91828   .91292   .91249   .91311   .91396
         approx  .91826   .91290   .91247   .91310   .91397
    50   exact   .96005   .95725   .95698   .95726   .95767
         approx  .96004   .95724   .95697   .95725   .95766
    75   exact   .97356   .97168   .97148   .97166   .97193
         approx  .97356   .97167   .97147   .97165   .97192
   100   exact   .98025   .97882   .97867   .97880   .97900
         approx  .98024   .97881   .97866   .97880   .97900

*The upper number is the exact value $L_1 = \lambda_1^{1/kn}$. The lower number is the approximation value.
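The first-term approximation of Theorem 6.2 can be turned into a percentage point for $L_1 = \lambda_1^{1/kn}$: if $x_\alpha$ is the upper $\alpha$ point of a Chi-Square variable with $2v = 3(k-1)$ degrees of freedom, then $-2q \log \lambda_1 = x_\alpha$ gives $L_1 \approx \exp\{-x_\alpha / (2qkn)\}$. The sketch below is a hedged illustration with hypothetical function names. It uses a pure-Python Chi-Square tail for integer degrees of freedom, checks with exact fractions that $q = 1 - (31k+13)/(18nk)$ makes the first coefficient $A_1$ vanish, and does not include the $m^{-2}$ correction. For $n = 10$, $k = 3$, $\alpha = .01$ it gives roughly 0.7057, near the corresponding approximation entry of Table 6-3.

```python
from fractions import Fraction
from math import erfc, exp, gamma, sqrt

def scale_q(n, k):
    """q = 1 - (31k + 13)/(18nk), kept exact."""
    return 1 - Fraction(31 * k + 13, 18 * n * k)

def a1_lambda1(n, k, q):
    """First coefficient A_1 for lambda_1, using B_2(x) = x^2 - x + 1/6."""
    def b2(x):
        return x * x - x + Fraction(1, 6)
    d = Fraction(n) * (1 - q) / 2                    # n(1 - q)/2
    top = b2(k * d - Fraction(k, 2)) + b2(k * d - Fraction(k + 1, 2))
    bottom = b2(d - Fraction(1, 2)) + b2(d - 1)
    return Fraction(-1, 2 * k) * (top - k ** 2 * bottom)

def chi2_sf(x, df):
    """P(chi-square_df > x) for integer df, via the incomplete-gamma recurrence."""
    y = x / 2.0
    if df % 2:                                       # odd df starts at df = 1
        tail, a = erfc(sqrt(y)), 0.5
    else:                                            # even df starts at df = 2
        tail, a = exp(-y), 1.0
    while 2 * a < df:
        tail += y ** a * exp(-y) / gamma(a + 1.0)
        a += 1.0
    return tail

def chi2_upper_point(alpha, df):
    """Solve chi2_sf(x, df) = alpha for x by bisection."""
    lo, hi = 0.0, 1000.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if chi2_sf(mid, df) > alpha:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def approx_point_L1(alpha, n, k):
    """First-term approximation to the lower alpha point of L_1."""
    x = chi2_upper_point(alpha, 3 * (k - 1))
    return exp(-x / (2.0 * float(scale_q(n, k)) * k * n))

if __name__ == "__main__":
    print(a1_lambda1(10, 3, scale_q(10, 3)))         # exact zero expected
    print(approx_point_L1(0.01, 10, 3))
```

Small values of $L_1$ lead to rejection, so the lower $\alpha$ point of $L_1$ corresponds to the upper $\alpha$ point of $M_1$; that is why the Chi-Square tail, not the c.d.f., is inverted.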

VII.  Practical  Illustration 


The  Engineering  and  Design  Group,  AFWAL/MLSE, 
Wright-Patterson  AFB,  Ohio,  provided  the  data  used  in  the 
following  analysis. 

The ultimate tensile strength of a metal alloy is characterized by two correlated variables, $x$ and $y$, longitudinal hardness and transversal hardness, respectively.

Test procedures:

1. Three companies received a sample block of metal alloy IN-9021, hand forged.

2. Each company used a common set of test conditions and conducted identical tests in accordance with ASTM testing standards.

3. Three measurements of longitudinal hardness and three measurements of transversal hardness were obtained by each company.

The collected data are shown in Table 7-1, by company, where L represents longitudinal hardness (ksi) and T represents transversal hardness (ksi).

We shall first test $H_0$, the hypothesis that there is no significant difference between the populations.

A summary of the necessary calculations is shown in Table 7-2, where the number of populations is $k = 3$ and the number of observations is $n = 3$.

TABLE 7-1

TENSILE STRENGTH: VARIABLES L AND T

   General Dynamics       Lockheed           Rockwell
      L       T          L       T          L       T
    87.3    85.9       89.0    87.4       87.6    86.0
    84.4    86.0       89.6    87.3       86.3    84.7
    89.8    84.8       89.6    87.3       83.9    84.7

TABLE 7-2

SUMMARY OF CALCULATIONS

   |A_1|       |A_2|      |A_3|       |A|         |A+B|
   2.88115     .00120     3.24480     43.1704     282.74248
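The within-sample determinants in Table 7-2 can be recomputed from the observations in Table 7-1, since each $A_g$ is the $2 \times 2$ matrix of sums of squares and products about the sample mean. The sketch below is illustrative only: the variable and function names are hypothetical, and the data are keyed in as read from Table 7-1. The General Dynamics and Rockwell values come out close to the tabulated $|A_1|$ and $|A_3|$; the Lockheed rows as printed contain a repeated observation, which makes that matrix singular, so its determinant is not compared here.

```python
def sscp_det(pairs):
    """Determinant of the 2x2 sum-of-squares-and-products matrix about the mean."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    syy = sum((y - mean_y) ** 2 for _, y in pairs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    return sxx * syy - sxy ** 2

# (L, T) pairs by company, as read from Table 7-1
general_dynamics = [(87.3, 85.9), (84.4, 86.0), (89.8, 84.8)]
rockwell = [(87.6, 86.0), (86.3, 84.7), (83.9, 84.7)]

if __name__ == "__main__":
    print("General Dynamics |A_1| =", round(sscp_det(general_dynamics), 5))
    print("Rockwell         |A_3| =", round(sscp_det(rockwell), 5))
```

With only three observations per company each $A_g$ has two degrees of freedom, so a single repeated pair is enough to drive a determinant to zero; small recording differences in one sample therefore have a large effect on the criteria.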

From equation (3.15) with $L_0^* = \lambda_0^{1/kn}$, we have $L_0^* = .0836$.

From Table I in the appendix with $\alpha = .05$, $n = 3$, $k = 3$, $L_0 = .10343$.

Decision Rule: Reject $H_0$ if $L_0^* \le L_{\alpha, n, k}$.

Since $L_0^* < .10343$, we reject $H_0$ at the $\alpha = .05$ significance level and conclude that the populations, as regards tensile strength, are not identical.

We now proceed to test $H_1$, the hypothesis that there is no significant difference in the populations as regards variance and covariance in the variables $x$ and $y$. From equation (3.16) with $L_1^* = \lambda_1^{1/kn}$ we have $L_1^* = .2087$.

From Table II in the appendix with $\alpha = .05$, $n = 3$, $k = 3$, $L_1 = .1509$.

By the Decision Rule, since $L_1^* > .1509$ we do not reject the hypothesis that the variances and covariances are the same at the $\alpha = .05$ significance level.

We may further test $H_2$, the hypothesis that the means are the same among the populations given that the covariance matrices are equal.

From Table III in the appendix with $\alpha = .05$, $n = 3$, $k = 3$, $L_2 = .4182$.

The Decision Rule leads us to reject $H_2$ and conclude that the means among the populations are not equal. This is consistent with our previous conclusion concerning $H_0$.

In summary, the sample ingots of metal alloy received by the companies have equal variances and covariances for the two attributes, longitudinal and transversal hardness, but the means differ at the 5 percent significance level.
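The three decisions in this chapter all follow the same rule: reject when the observed statistic does not exceed the tabulated lower percentage point. A minimal sketch of that bookkeeping follows. The values for $H_0$ and $H_1$ are the ones quoted in the text; the observed statistic for $H_2$ is not quoted, so the sketch assumes $L_2^* = (|A| / |A+B|)^{1/2}$, which is what $\lambda_2 = (|A|/|A+B|)^{N/2}$ and $L_2 = \lambda_2^{1/N}$ imply, and evaluates it from the Table 7-2 determinants. All names are hypothetical.

```python
def reject(statistic, critical_value):
    """Decision rule: reject the hypothesis when the statistic is at or below
    the tabulated lower percentage point."""
    return statistic <= critical_value

# (observed statistic, alpha = .05 critical value) for n = 3, k = 3
tests = {
    "H0": (0.0836, 0.10343),   # quoted L0* and Table I value
    "H1": (0.2087, 0.1509),    # quoted L1* and Table II value
}

# Assumption: L2* computed from the Table 7-2 determinants |A| and |A+B|
det_A, det_A_plus_B = 43.1704, 282.74248
l2_star = (det_A / det_A_plus_B) ** 0.5
tests["H2"] = (l2_star, 0.4182)            # Table III value

decisions = {name: reject(stat, crit) for name, (stat, crit) in tests.items()}

if __name__ == "__main__":
    for name, (stat, crit) in tests.items():
        verdict = "reject" if decisions[name] else "do not reject"
        print(f"{name}: statistic {stat:.4f} vs critical {crit:.4f} -> {verdict}")
```

Under these assumptions the outcome matches the narrative: $H_0$ and $H_2$ are rejected while $H_1$ is not.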


VIII.  Conclusion 


In  this  thesis  we  obtained  the  exact  distribution 
for  our  test  criteria  A  ^ ,  A  and  A To  do  this  we  assumed 
that  our  samples  were  drawn  from  populations  distributed 
as  Np(x[u^,Z^)  and  then  restricted  the  development  of  the 
sampling  distributions  to  the  case  of  equal  sample  sizes 
and  two  variables.  From  the  exact  distributions  we  were 
able  to  obtain  tables  of  percentage  points  whic  .  enables 
one  to  test  the  hypotheses  considered  in  this  thesis. 

The asymptotic approximations to the distributions
of Λ0 and Λ1 extended our testing ability to sample sizes
and populations not covered by the tables. The comparison
tables in Chapter VI showed that the asymptotic expansion
yields close approximations to the percentage points of
the test statistics.
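For readers who want a quick large-sample check when no table applies, the following Python sketch uses the classical first-order result that −2 log λ is approximately chi-square distributed under the hypothesis of equal covariance matrices, with (k − 1)p(p + 1)/2 degrees of freedom. This is not the asymptotic expansion of Chapter VI, only its familiar leading term, and λ here means the classical criterion rather than any rescaled form of it. The chi-square quantile is itself approximated by the Wilson-Hilferty formula so that only the standard library is needed; the function names are hypothetical.

```python
from math import log, sqrt
from statistics import NormalDist


def chi2_quantile_wh(df: int, prob: float) -> float:
    """Wilson-Hilferty approximation to the chi-square quantile."""
    z = NormalDist().inv_cdf(prob)
    c = 2.0 / (9.0 * df)
    return df * (1.0 - c + z * sqrt(c)) ** 3


def large_sample_test(lam: float, k: int, p: int = 2, alpha: float = 0.05):
    """First-order chi-square test based on -2 log(lambda).

    Returns the statistic, the approximate critical value, and the decision.
    Assumption: lam is the classical LR criterion for equal covariance matrices.
    """
    df = (k - 1) * p * (p + 1) // 2
    stat = -2.0 * log(lam)
    crit = chi2_quantile_wh(df, 1.0 - alpha)
    return stat, crit, stat > crit


# Hypothetical criterion value for k = 3 bivariate populations.
stat, crit, reject = large_sample_test(0.05, k=3)
print(f"-2 log lambda = {stat:.3f}, approximate critical value = {crit:.3f}, reject = {reject}")
```

Because the approximation improves only as the sample size grows, the exact tables remain the better choice for the small samples they cover.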

The importance of multivariate analysis is illustrated
by the many entities that require several traits to
describe their characteristics. Testing all the attributes
simultaneously is necessary because correlations may exist
among the variables. For example, the quality of a relay
might be accurately characterized by three variables:
capacitance, inductance and resistance. Similarly, a metal
alloy may require shear strength and compression strength,
in addition to tensile strength, to adequately describe
its quality.


This theory applies to practical problems in many
areas, including agriculture, anthropology, economics,
physics, industry, medicine and sociology, to name a few.

In light of this, areas of further study include
extending the derivation of the sampling distributions to
more than two variables and to unequal sample sizes. Also,
in order to study the power of the tests, it would be
worthwhile to develop methods to obtain the non-null
distributions of the criteria.
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N 

                                  k
   N          2          3          4          5          6

   3    29.2635    40.8391    40.6644    56.7496    62.4613
   4    20.9726    32.2001    42.1739    51.2641    59.5659
   5    17.1743    27.8168    37.0143    45.7554    54.1430
   6    16.0926    25.5186    34.1449    42.3913    50.3931
   7    15.1011    24.1174    32.3789    40.2835    47.9673
   8    14.4371    23.1736    31.1861    38.8544    46.3109
   9    13.9611    22.4943    30.3261    37.8228    45.1132
  10    13.6032    21.9818    29.6764    37.0430    44.2071
  11    13.3242    21.5812    29.1682    36.4327    43.4978
  12    13.1007    21.2596    28.7598    35.9419    42.9272
  13    12.9174    20.9955    28.4243    35.5387    42.4533
  14    12.7646    20.7748    28.1438    35.2015    42.0661
  15    12.6351    20.5877    27.9058    34.9153    41.733?
  16    12.5240    20.4269    27.701?    34.6693    41.4470
  17    12.4277    20.2874    27.5236    34.5556    41.1984
  18    12.3433    20.1651    27.3678    34.2682    40.9804
  19    12.2688    20.0570    27.2302    34.1025    40.7876
  20    12.2025    19.9608    27.1077    33.9551    40.6160
  21    12.1432    19.8747    26.9979    33.8230    40.4622
  22    12.0898    19.7970    26.8989    33.7038    40.3236
  23    12.0415    19.7267    26.8093    33.5960    40.1980
  24    11.9975    19.6628    26.7278    33.497?    40.0837
  25    11.9574    19.6043    26.6532    33.4080    39.9792
  30    11.7995    19.3743    26.3508    33.0540    39.5678
  35    11.6894    19.2136    26.1547    32.8075    39.2800
  40    11.60?1    19.0949    26.0132    32.6245    39.0675
  45    11.5458    19.0038    25.8867    32.4846    38.9040
  50    11.4963    18.9315    25.7944    32.3734    38.7743
  55    11.4562    18.8728    25.7194    32.283?    38.6691
  60    11.4230    18.8242    25.6573    32.????    38.5819
  65    11.3951    18.7833    25.6050    32.14??    38.50?5
  70    11.3713    18.7484    25.5604    32.0913    38.4458
  75    11.3507    18.7182    25.5218    32.0448    38.3417
  80    11.3328    18.6920    25.4832    32.0043    38.3445
  85    11.3170    18.6688    25.4586    31.9080    38.30?4
  90    11.3030    18.6483    25.4324    31.9370    38.2601
  95    11.2905    18.6300    25.4090    31.9088    38.2332
 100    11.2793    18.6136    25.3880    ??.????    38.2037


Table V.  Percentage Points of V! ^
α = .01
Table V.  Percentage Points of V! ^
α = .05


                                  k
   N          2          3          4          5          6

   3    24.2640    34.0409    41.1618    46.6659    50.9873
   4    16.2394    24.8688    32.4639    39.3697    45.6477
   5    13.3765    20.8134    27.4916    33.7962    39.8344
   6    11.9622    18.7598    24.8772    30.6733    36.2679
   7    11.1208    17.5269    23.2976    28.7676    34.0524
   8    10.5631    16.7049    22.2415    27.4903    32.5622
   9    10.1663    16.1178    21.4856    26.5751    31.4932
  10     9.8696    15.6773    20.9179    25.8871    30.6890

11
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15.3347 
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2  5  .  3  51  0 

30.0621 

12 

9.4555 

15.0606 
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24.9214 
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13 

9.3053

14.8363

19.8318

24.56?5
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14 
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16 
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21 

8.6752 
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22 
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23 
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24 
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1  7 .7  649 

22.0567 

26.2047 

45

8.1940 

13.1662 

17.6667 
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